MOLECULES FOR DISEASE DETECTION AND TREATMENT

TECHNICAL FIELD 
This invention relates to nucleic acid and amino acid sequences of molecules for disease
detection and treatment and to the use of these sequences in the diagnosis, treatment, and prevention
of cell proliferative, autoimmune/inflammatory, developmental, and neurological disorders, and
infections, and in the assessment of the effects of exogenous compounds on the expression of nucleic
acid and amino acid sequences of molecules for disease detection and treatment.

BACKGROUND OF THE INVENTION

It is estimated that only 2% of mammalian DNA encodes proteins, and only a small fraction of
the genes that encode proteins are actually expressed in a particular cell at any time. The various
types of cells in a multicellular organism differ dramatically both in structure and function, and the
identity of a particular cell is conferred by its unique pattern of gene expression. In addition, different
cell types express overlapping but distinctive sets of genes throughout development. Cell growth and
proliferation, cell differentiation, the immune response, apoptosis, and other processes that contribute
to organismal development and survival are governed by regulation of gene expression. Appropriate
gene regulation also ensures that cells function efficiently by expressing only those genes whose
functions are required at a given time. Factors that influence gene expression include extracellular
signals that mediate cell-cell communication and coordinate the activities of different cell types. Gene
expression is regulated at the level of DNA and RNA transcription, and at the level of mRNA
translation.

Aberrant expression or mutations in genes and their products may cause, or increase
susceptibility to, a variety of human diseases such as cancer and other cell proliferative disorders. The
identification of these genes and their products is the basis of an ever-expanding effort to find markers
for early detection of diseases and targets for their prevention and treatment. For example, cancer
represents a type of cell proliferative disorder that affects nearly every tissue in the body. The
development of cancer, or oncogenesis, is often correlated with the conversion of a normal gene into a
cancer-causing gene, or oncogene, through abnormal expression or mutation. Oncoproteins, the
products of oncogenes, include a variety of molecules that influence cell proliferation, such as growth
factors, growth factor receptors, intracellular signal transducers, nuclear transcription factors, and
cell-cycle control proteins. In contrast, tumor-suppressor genes are involved in inhibiting cell
proliferation. Mutations which reduce or abrogate the function of tumor-suppressor genes result in
aberrant cell proliferation and cancer. Thus a wide variety of genes and their products have been
found that are associated with cell proliferative disorders such as cancer, but many more may exist
that are yet to be discovered.

DNA-based arrays can provide an efficient, high-throughput method to examine gene
expression and genetic variability. For example, SNPs, or single nucleotide polymorphisms, are the
most common type of human genetic variation. DNA-based arrays can dramatically accelerate the
discovery of SNPs in hundreds and even thousands of genes. Likewise, such arrays can be used for
SNP genotyping in which DNA samples from individuals or populations are assayed for the presence
of selected SNPs. These approaches will ultimately lead to the systematic identification of all genetic
variations in the human genome and the correlation of certain genetic variations with disease
susceptibility, responsiveness to drug treatments, and other medically relevant information. (See, for
example, Wang, D.G. et al. (1998) Science 280:1077-1082.)
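
By way of illustration only, and not as a method described in this document, the SNP genotyping
results discussed above could be summarized by tallying allele frequencies across the assayed
samples. The function name, the two-letter genotype encoding (for example "AG"), and the sample
values are assumptions made for this sketch.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} for one SNP.

    `genotypes` is an iterable of two-character strings such as "AG",
    one per diploid sample (a hypothetical encoding chosen for this sketch).
    """
    # Count every allele observed across all samples.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical genotype calls for one SNP in four individuals.
samples = ["AA", "AG", "GG", "AG"]
print(allele_frequencies(samples))
```

Frequencies computed separately for an affected and a control population could then be compared
to look for the kind of correlation with disease susceptibility mentioned above.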

DNA-based array technology is especially important for the rapid analysis of global gene
expression patterns. For example, genetic predisposition, disease, or therapeutic treatment may
directly or indirectly affect the expression of a large number of genes in a given tissue. In this case, it
is useful to develop a profile, or transcript image, of all the genes that are expressed and the levels at
which they are expressed in that particular tissue. A profile generated from an individual or population
affected with a certain disease or undergoing a particular therapy may be compared with a profile
generated from a control individual or population. Such analysis does not require knowledge of gene
function, as the expression profiles can be subjected to mathematical analyses which simply treat each
gene as a marker. Furthermore, gene expression profiles may help dissect biological pathways by
identifying all the genes expressed, for example, at a certain developmental stage, in a particular
tissue, or in response to disease or treatment. (See, for example, Lander, E.S. et al. (1996) Science
274:536-539.)
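
The comparison of a disease or treatment profile with a control profile can be sketched as follows.
This is a minimal illustration under stated assumptions, not an analysis prescribed by this document:
each profile is assumed to be a mapping from a gene identifier to a non-negative expression level, and
the pseudocount, the threshold, and all names are hypothetical. Each gene is treated purely as a
marker, with no use of its function.

```python
import math


def log2_fold_changes(treated, control, pseudocount=1.0):
    """Return {gene: log2 ratio of treated to control} for genes in both profiles.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes that are not detected in one of the profiles.
    """
    shared = treated.keys() & control.keys()
    return {
        gene: math.log2((treated[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in shared
    }


def differential_markers(treated, control, threshold=1.0):
    """List genes whose absolute log2 fold change meets the threshold.

    Genes are ordered by decreasing magnitude of change, then by name.
    """
    changes = log2_fold_changes(treated, control)
    selected = [gene for gene, lfc in changes.items() if abs(lfc) >= threshold]
    return sorted(selected, key=lambda gene: (-abs(changes[gene]), gene))


# Hypothetical expression levels for three genes.
treated_profile = {"geneA": 7.0, "geneB": 1.0, "geneC": 3.0}
control_profile = {"geneA": 1.0, "geneB": 1.0, "geneC": 15.0}
print(differential_markers(treated_profile, control_profile))
```

In this example geneA is raised fourfold and geneC is lowered fourfold relative to the control, so both
are reported, while the unchanged geneB is not.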

Certain genes are known to be associated with diseases because of their chromosomal
location, such as the genes in the myotonic dystrophy (DM) regions of mouse and human. The
mutation underlying DM has been localized to a gene encoding the DM-kinase protein, but another
active gene, DMR-N9, is in close proximity to the DM-kinase gene (Jansen, G. et al. (1992) Nat.
Genet. 1:261-266). DMR-N9 encodes a 650 amino acid protein that contains WD repeats, motifs
found in cell signaling proteins. DMR-N9 is expressed in all neural tissues and in the testis, suggesting
a role for DMR-N9 in the manifestation of mental and testicular symptoms in severe cases of DM
(Jansen, G. et al. (1995) Hum. Mol. Genet. 4:843-852).

Other genes are identified based upon their expression patterns or association with disease
syndromes. For example, autoantibodies to subcellular organelles are found in patients with systemic
rheumatic diseases. A recently identified protein, golgin-67, belongs to a family of Golgi autoantigens
having alpha-helical coiled-coil domains (Eystathioy, T. et al. (2000) J. Autoimmun. 14:179-187). The
Stac gene was identified as a brain specific, developmentally regulated gene. The Stac protein
contains an SH3 domain, and is thought to be involved in neuron-specific signal transduction (Suzuki,
H. et al. (1996) Biochem. Biophys. Res. Commun. 229:902-909).
Structural and Cytoskeleton-Associated Proteins

The cytoskeleton is a cytoplasmic network of protein fibers that mediate cell shape, structure,
and movement. The cytoskeleton supports the cell membrane and forms tracks along which
organelles and other elements move in the cytosol. The cytoskeleton is a dynamic structure that
allows cells to adopt various shapes and to carry out directed movements. Major cytoskeletal fibers
include the microtubules, the microfilaments, and the intermediate filaments. Motor proteins, including
myosin, dynein, and kinesin, drive movement of or along the fibers. The motor protein dynamin drives
the formation of membrane vesicles. Accessory or associated proteins modify the structure or activity
of the fibers while cytoskeletal membrane anchors connect the fibers to the cell membrane.
Microtubules and Associated Proteins
Tubulins 

Microtubules, cytoskeletal fibers with a diameter of about 24 nm, have multiple roles in the
cell. Bundles of microtubules form cilia and flagella, which are whip-like extensions of the cell
membrane that are necessary for sweeping materials across an epithelium and for swimming of
sperm, respectively. Marginal bands of microtubules in red blood cells and platelets are important for
these cells' pliability. Organelles, membrane vesicles, and proteins are transported in the cell along
tracks of microtubules. For example, microtubules run through nerve cell axons, allowing
bi-directional transport of materials and membrane vesicles between the cell body and the nerve
terminal. Failure to supply the nerve terminal with these vesicles blocks the transmission of neural
signals. Microtubules are also critical to chromosomal movement during cell division. Both stable and
short-lived populations of microtubules exist in the cell.

Microtubules are polymers of GTP-binding tubulin protein subunits. Each subunit is a
heterodimer of α- and β-tubulin, multiple isoforms of which exist. The hydrolysis of GTP is linked to
the addition of tubulin subunits at the end of a microtubule. The subunits interact head to tail to form
protofilaments; the protofilaments interact side to side to form a microtubule. A microtubule is
polarized, one end ringed with α-tubulin and the other with β-tubulin, and the two ends differ in their
rates of assembly. Generally, each microtubule is composed of 13 protofilaments although 11 or 15
protofilament microtubules are sometimes found. Cilia and flagella contain doublet microtubules.
Microtubules grow from specialized structures known as centrosomes or microtubule-organizing
centers (MTOCs). MTOCs may contain one or two centrioles, which are pinwheel arrays of triplet
microtubules. The basal body, the organizing center located at the base of a cilium or flagellum,
contains one centriole. Gamma tubulin present in the MTOC is important for nucleating the
polymerization of α- and β-tubulin heterodimers but does not polymerize into microtubules.
Microtubule-Associated Proteins 

Microtubule-associated proteins (MAPs) have roles in the assembly and stabilization of
microtubules. One major family of MAPs, assembly MAPs, can be identified in neurons as well as
non-neuronal cells. Assembly MAPs are responsible for cross-linking microtubules in the cytosol.
These MAPs are organized into two domains: a basic microtubule-binding domain and an acidic
projection domain. The projection domain is the binding site for membranes, intermediate filaments, or
other microtubules. Based on sequence analysis, assembly MAPs can be further grouped into two
types: Type I and Type II. Type I MAPs, which include MAP1A and MAP1B, are large, filamentous
molecules that co-purify with microtubules and are abundantly expressed in brain and testes. Type I
MAPs contain several repeats of a positively-charged amino acid sequence motif that binds and
neutralizes negatively charged tubulin, leading to stabilization of microtubules. MAP1A and MAP1B
are each derived from a single precursor polypeptide that is subsequently proteolytically processed to
generate one heavy chain and one light chain.

Another light chain, LC3, is a 16.4 kDa molecule that binds MAP1A, MAP1B, and
microtubules. It is suggested that LC3 is synthesized from a source other than the MAP1A or
MAP1B transcripts, and that the expression of LC3 may be important in regulating the microtubule
binding activity of MAP1A and MAP1B during cell proliferation (Mann, S.S. et al. (1994) J. Biol.
Chem. 269:11492-11497).

Type II MAPs, which include MAP2a, MAP2b, MAP2c, MAP4, and Tau, are characterized
by three to four copies of an 18-residue sequence in the microtubule-binding domain. MAP2a,
MAP2b, and MAP2c are found only in dendrites, MAP4 is found in non-neuronal cells, and Tau is
found in axons and dendrites of nerve cells. Alternative splicing of the Tau mRNA leads to the
existence of multiple forms of Tau protein. Tau phosphorylation is altered in neurodegenerative
disorders such as Alzheimer's disease, Pick's disease, progressive supranuclear palsy, corticobasal
degeneration, and familial frontotemporal dementia and Parkinsonism linked to chromosome 17. The
altered Tau phosphorylation leads to a collapse of the microtubule network and the formation of
intraneuronal Tau aggregates (Spillantini, M.G. and M. Goedert (1998) Trends Neurosci. 21:428-433).

Another microtubule associated protein, STOP (stable tubule only polypeptide), is a
calmodulin-regulated protein that regulates stability (Denarier, E. et al. (1998) Biochem. Biophys. Res.
Commun. 24:791-796). In order for neurons to maintain conductive connections over great distances,
they rely upon axodendritic extensions, which in turn are supported by microtubules. STOP proteins
function to stabilize the microtubular network. STOP proteins are associated with axonal
microtubules, and are also abundant in neurons (Guillaud, L. et al. (1998) J. Cell Biol. 142:167-179).
STOP proteins are necessary for normal neurite formation, and have been observed to stabilize
microtubules, in vitro, against cold-, calcium-, or drug-induced disassembly (Margolis, R.L. et al.
(1990) EMBO 9:4095-502).
Microfilaments and Associated Proteins
Actins 

Microfilaments, cytoskeletal filaments with a diameter of about 7-9 nm, are vital to cell
locomotion, cell shape, cell adhesion, cell division, and muscle contraction. Assembly and disassembly
of the microfilaments allow cells to change their morphology. Microfilaments are the polymerized
form of actin, the most abundant intracellular protein in the eukaryotic cell. Human cells contain six
isoforms of actin. The three α-actins are found in different kinds of muscle, nonmuscle β-actin and
nonmuscle γ-actin are found in nonmuscle cells, and another γ-actin is found in intestinal smooth
muscle cells. G-actin, the monomeric form of actin, polymerizes into polarized, helical F-actin
filaments, accompanied by the hydrolysis of ATP to ADP. Actin filaments associate to form bundles
and networks, providing a framework to support the plasma membrane and determine cell shape.
These bundles and networks are connected to the cell membrane. In muscle cells, thin filaments
containing actin slide past thick filaments containing the motor protein myosin during contraction. A
family of actin-related proteins exists that are not part of the actin cytoskeleton, but rather associate
with microtubules and dynein.

Actin-Associated Proteins

Actin-associated proteins have roles in cross-linking, severing, and stabilization of actin
filaments and in sequestering actin monomers. Several of the actin-associated proteins have multiple
functions. Bundles and networks of actin filaments are held together by actin cross-linking proteins.
These proteins have two actin-binding sites, one for each filament. Short cross-linking proteins
promote bundle formation while longer, more flexible cross-linking proteins promote network
formation. Actin-interacting proteins (AIPs) participate in the regulation of actin filament
organization. Other actin-associated proteins such as TARA, a novel F-actin binding protein, function
in a similar capacity by regulating actin cytoskeletal organization. Calmodulin-like calcium-binding
domains in actin cross-linking proteins allow calcium regulation of cross-linking. Group I cross-linking
proteins have unique actin-binding domains and include the 30 kD protein, EF-1α, fascin, and scruin.
Group II cross-linking proteins have a 7,000-MW actin-binding domain and include villin and dematin.
Group III cross-linking proteins have pairs of a 26,000-MW actin-binding domain and include fimbrin,
spectrin, dystrophin, ABP 120, and filamin.

Severing proteins regulate the length of actin filaments by breaking them into short pieces or
by blocking their ends. Severing proteins include gCAP39, severin (fragmin), gelsolin, and villin.
Capping proteins can cap the ends of actin filaments, but cannot break filaments. Capping proteins
include CapZ and tropomodulin. The proteins thymosin and profilin sequester actin monomers in the
cytosol, allowing a pool of unpolymerized actin to exist. The actin-associated proteins tropomyosin,
troponin, and caldesmon regulate muscle contraction in response to calcium.

Microtubule and actin filament networks cooperate in processes such as vesicle and organelle
transport, cleavage furrow placement, directed cell migration, spindle rotation, and nuclear migration.
Microtubules and actin may coordinate to transport vesicles, organelles, and cell fate determinants, or
transport may involve targeting and capture of microtubule ends at cortical actin sites. These
cytoskeletal systems may be bridged by myosin-kinesin complexes, myosin-CLIP170 complexes,
formin-homology (FH) proteins, dynein, the dynactin complex, Kar9p, coronin, ERM proteins, and
kelch repeat-containing proteins (for a review, see Goode, B.L. et al. (2000) Curr. Opin. Cell Biol.
12:63-71). The kelch repeat is a motif originally observed in the kelch protein, which is involved in
formation of cytoplasmic bridges called ring canals. A variety of mammalian and other kelch family
proteins have been identified. The kelch repeat domain is believed to mediate interaction with actin
(Robinson, D.N. and L. Cooley (1997) J. Cell Biol. 138:799-810).

ADF/cofilins are a family of conserved 15-18 kDa actin-binding proteins that play a role in
cytokinesis, endocytosis, and in development of embryonic tissues, as well as in tissue regeneration
and in pathologies such as ischemia, oxidative or osmotic stress. LIM kinase 1 downregulates ADF
(Carlier, M.F. et al. (1999) J. Biol. Chem. 274:33827-33830).

LIM is an acronym of three transcription factors, Lin-11, Isl-1, and Mec-3, in which the motif
was first identified. The LIM domain is a double zinc-finger motif that mediates the protein-protein
interactions of transcription factors, signaling, and cytoskeleton-associated proteins (Roof, D.J. et al.
(1997) J. Cell Biol. 138:575-588). These proteins are distributed in the nucleus, cytoplasm, or both
(Brown, S. et al. (1999) J. Biol. Chem. 274:27083-27091). Recently, ALP (actinin-associated LIM
protein) has been shown to bind alpha-actinin-2 (Bouju, S. et al. (1999) Neuromuscul. Disord. 9:3-10).

The Frabin protein is another example of an actin-filament binding protein (Obaishi, H. et al.
(1998) J. Biol. Chem. 273:18697-18700). Frabin (FGD1-related F-actin-binding protein) possesses one
actin-filament binding (FAB) domain, one Dbl homology (DH) domain, two pleckstrin homology (PH)
domains, and a single cysteine-rich FYVE (Fab1p, YOTB, Vac1p, and EEA1 (early endosomal
antigen 1)) domain. Frabin has shown GDP/GTP exchange activity for Cdc42 small G protein
(Cdc42), and indirectly induces activation of Rac small G protein (Rac) in intact cells. Through the
activation of Cdc42 and Rac, Frabin is able to induce formation of both filopodia- and lamellipodia-like
processes (Ono, Y. et al. (2000) Oncogene 19:3050-3058). The Rho family small GTP-binding
proteins are important regulators of actin-dependent cell functions including cell shape change,
adhesion, and motility. The Rho family consists of three major subfamilies: Cdc42, Rac, and Rho.
Rho family members cycle between GDP-bound inactive and GTP-bound active forms by means of a
GDP/GTP exchange factor (GEF) (Umikawa, M. et al. (1999) J. Biol. Chem. 274:25197-25200). The
Rho GEF family is crucial for microfilament organization.
Intermediate Filaments and Associated Proteins

Intermediate filaments (IFs) are cytoskeletal fibers with a diameter of about 10 nm,
intermediate between that of microfilaments and microtubules. IFs serve structural roles in the cell,
reinforcing cells and organizing cells into tissues. IFs are particularly abundant in epidermal cells and
in neurons. IFs are extremely stable, and, in contrast to microfilaments and microtubules, do not
function in cell motility.

Five types of IF proteins are known in mammals. Type I and Type II proteins are the acidic
and basic keratins, respectively. Heterodimers of the acidic and basic keratins are the building blocks
of keratin IFs. Keratins are abundant in soft epithelia such as skin and cornea, hard epithelia such as
nails and hair, and in epithelia that line internal body cavities. Mutations in keratin genes lead to
epithelial diseases including epidermolysis bullosa simplex, bullous congenital ichthyosiform
erythroderma (epidermolytic hyperkeratosis), non-epidermolytic and epidermolytic palmoplantar
keratoderma, ichthyosis bullosa of Siemens, pachyonychia congenita, and white sponge nevus. Some
of these diseases result in severe skin blistering. (See, e.g., Wawersik, M. et al. (1997) J. Biol. Chem.
272:32557-32565; and Corden, L.D. and W.H. McLean (1996) Exp. Dermatol. 5:297-307.)

Type III IF proteins include desmin, glial fibrillary acidic protein, vimentin, and peripherin.
Desmin filaments in muscle cells link myofibrils into bundles and stabilize sarcomeres in contracting
muscle. Glial fibrillary acidic protein filaments are found in the glial cells that surround neurons and
astrocytes. Vimentin filaments are found in blood vessel endothelial cells, some epithelial cells, and
mesenchymal cells such as fibroblasts, and are commonly associated with microtubules. Vimentin
filaments may have roles in keeping the nucleus and other organelles in place in the cell. Type IV IFs
include the neurofilaments and nestin. Neurofilaments, composed of three polypeptides NF-L, NF-M,
and NF-H, are frequently associated with microtubules in axons. Neurofilaments are responsible for
the radial growth and diameter of an axon, and ultimately for the speed of nerve impulse transmission.
Changes in phosphorylation and metabolism of neurofilaments are observed in neurodegenerative
diseases including amyotrophic lateral sclerosis, Parkinson's disease, and Alzheimer's disease (Julien,
J.P. and W.E. Mushynski (1998) Prog. Nucleic Acid Res. Mol. Biol. 61:1-23). Type V IFs, the
lamins, are found in the nucleus where they support the nuclear membrane.

IFs have a central α-helical rod region interrupted by short nonhelical linker segments. The
rod region is bracketed, in most cases, by non-helical head and tail domains. The rod regions of
intermediate filament proteins associate to form a coiled-coil dimer. A highly ordered assembly
process leads from the dimers to the IFs. Neither ATP nor GTP is needed for IF assembly, unlike
that of microfilaments and microtubules.

IF-associated proteins (IFAPs) mediate the interactions of IFs with one another and with
other cell structures. IFAPs cross-link IFs into a bundle, into a network, or to the plasma membrane,
and may cross-link IFs to the microfilament and microtubule cytoskeleton. Microtubules and IFs are
particularly closely associated. IFAPs include BPAG1, plakoglobin, desmoplakin I, desmoplakin II,
plectin, ankyrin, filaggrin, and lamin B receptor.

Cytoskeletal-Membrane Anchors

Cytoskeletal fibers are attached to the plasma membrane by specific proteins. These
attachments are important for maintaining cell shape and for muscle contraction. In erythrocytes, the
spectrin-actin cytoskeleton is attached to the cell membrane by three proteins, band 4.1, ankyrin, and
adducin. Defects in this attachment result in abnormally shaped cells which are more rapidly
degraded by the spleen, leading to anemia. In platelets, the spectrin-actin cytoskeleton is also linked to
the membrane by ankyrin; a second actin network is anchored to the membrane by filamin. In muscle
cells the protein dystrophin links actin filaments to the plasma membrane; mutations in the dystrophin
gene lead to Duchenne muscular dystrophy.

Focal adhesions

Focal adhesions are specialized structures in the plasma membrane involved in the adhesion of
a cell to a substrate, such as the extracellular matrix. Focal adhesions form the connection between
an extracellular substrate and the cytoskeleton, and affect such functions as cell shape, cell motility
and cell proliferation. Transmembrane integrin molecules form the basis of focal adhesions. Upon
ligand binding, integrins cluster in the plane of the plasma membrane. Cytoskeletal linker proteins such
as the actin binding proteins α-actinin, talin, tensin, vinculin, paxillin, and filamin are recruited to the
clustering site. Key regulatory proteins, such as Rho and Ras family proteins, focal adhesion kinase,
and Src family members are also recruited. These events lead to the reorganization of actin filaments
and the formation of stress fibers. These intracellular rearrangements promote further integrin-ECM
interactions and integrin clustering. Thus, integrins mediate aggregation of protein complexes on both
the cytosolic and extracellular faces of the plasma membrane, leading to the assembly of the focal
adhesion. Many signal transduction responses are mediated via various adhesion complex proteins,
including Src, FAK, paxillin, and tensin. (For a review, see Yamada, K.M. and B. Geiger (1997)
Curr. Opin. Cell Biol. 9:76-85.)

IFs are also attached to membranes by cytoskeletal-membrane anchors. The nuclear lamina
is attached to the inner surface of the nuclear membrane by the lamin B receptor. Vimentin IFs are
attached to the plasma membrane by ankyrin and plectin. Desmosome and hemidesmosome
membrane junctions hold together epithelial cells of organs and skin. These membrane junctions allow
shear forces to be distributed across the entire epithelial cell layer, thus providing strength and rigidity
to the epithelium. IFs in epithelial cells are attached to the desmosome by plakoglobin and
desmoplakins. The proteins that link IFs to hemidesmosomes are not known. Desmin IFs surround
the sarcomere in muscle and are linked to the plasma membrane by paranemin, synemin, and ankyrin.

The protein components of tight junctions include ZO-1 and ZO-2 (zona occludens),
cytoplasmic proteins associated with the plasma membrane at tight junctions. ZO-1 is a PDZ
domain-containing protein which associates with spectrin and thus may link tight junctions to the actin
cytoskeleton. Other cytoplasmic components of tight junctions include cingulin, 7H6 antigen,
symplekin, and small rab family GTPases. The first identified component of the tight junction strands,
which form the actual junction between cells, was the integral membrane protein occludin, a 65 kD
protein with four transmembrane domains. ZO-1 binds to the carboxy-terminal region of occludin and
may localize occludin to the tight junction. A recently identified family of proteins, the claudins, are
also components of tight junction strands.
Motor Proteins 
Myosin-related Motor Proteins

Myosins are actin-activated ATPases, found in eukaryotic cells, that couple hydrolysis of ATP
with motion. Myosin provides the motor function for muscle contraction and intracellular movements
such as phagocytosis and rearrangement of cell contents during mitotic cell division (cytokinesis). The
contractile unit of skeletal muscle, termed the sarcomere, consists of highly ordered arrays of thin
actin-containing filaments and thick myosin-containing filaments. Crossbridges form between the thick
and thin filaments, and the ATP-dependent movement of myosin heads within the thick filaments pulls
the thin filaments, shortening the sarcomere and thus the muscle fiber.

Myosins are composed of one or two heavy chains and associated light chains. Myosin heavy
chains contain an amino-terminal motor or head domain, a neck that is the site of light-chain binding,
and a carboxy-terminal tail domain. The tail domains may associate to form an α-helical coiled coil.
Conventional myosins, such as those found in muscle tissue, are composed of two myosin heavy-chain
subunits, each associated with two light-chain subunits that bind at the neck region and play a
regulatory role. Unconventional myosins, believed to function in intracellular motion, may contain
either one or two heavy chains and associated light chains. There is evidence for about 25 myosin
heavy chain genes in vertebrates, more than half of them unconventional.

Dynein-related Motor Proteins

Dyneins are (-) end-directed motor proteins which act on microtubules. Two classes of
dyneins, cytosolic and axonemal, have been identified. Cytosolic dyneins are responsible for
translocation of materials along cytoplasmic microtubules, for example, transport from the nerve
terminal to the cell body and transport of endocytic vesicles to lysosomes. As well, viruses often take
advantage of cytoplasmic dyneins to be transported to the nucleus and establish a successful infection
(Sodeik, B. et al. (1997) J. Cell Biol. 136:1007-1021). Virion proteins of herpes simplex virus 1, for
example, interact with the cytoplasmic dynein intermediate chain (Ye, G.J. et al. (2000) J. Virol.
74:1355-1363). Cytoplasmic dyneins are also reported to play a role in mitosis. Axonemal dyneins are
responsible for the beating of flagella and cilia. Dynein on one microtubule doublet walks along the
adjacent microtubule doublet. This sliding force produces bending that causes the flagellum or cilium
to beat. Dyneins have a native mass between 1000 and 2000 kDa and contain either two or three
force-producing heads driven by the hydrolysis of ATP. The heads are linked via stalks to a basal
domain which is composed of a highly variable number of accessory intermediate and light chains.
Cytoplasmic dynein is the largest and most complex of the motor proteins.

Kinesin-related Motor Proteins

Kinesins are (+) end-directed motor proteins which act on microtubules. The prototypical
kinesin molecule is involved in the transport of membrane-bound vesicles and organelles. This
function is particularly important for axonal transport in neurons. Kinesin is also important in all cell
types for the transport of vesicles from the Golgi complex to the endoplasmic reticulum. This role is
critical for maintaining the identity and functionality of these secretory organelles.

Kinesins define a ubiquitous, conserved family of over 50 proteins that can be classified into at
least 8 subfamilies based on primary amino acid sequence, domain structure, velocity of movement,
and cellular function. (Reviewed in Moore, J.D. and S.A. Endow (1996) Bioessays 18:207-219; and
Hoyt, A.M. (1994) Curr. Opin. Cell Biol. 6:63-68.) The prototypical kinesin molecule is a
heterotetramer composed of two heavy polypeptide chains (KHCs) and two light polypeptide chains
(KLCs). The KHC subunits are typically referred to as "kinesin." KHC is about 1000 amino acids in
length, and KLC is about 550 amino acids in length. Two KHCs dimerize to form a rod-shaped
molecule with three distinct regions of secondary structure. At one end of the molecule is a globular
motor domain that functions in ATP hydrolysis and microtubule binding. Kinesin motor domains are
highly conserved and share over 70% identity. Beyond the motor domain is an α-helical coiled-coil
region which mediates dimerization. At the other end of the molecule is a fan-shaped tail that
associates with molecular cargo. The tail is formed by the interaction of the KHC C-termini with the
two KLCs.

Members of the more divergent subfamilies of kinesins are called kinesin-related proteins
(KRPs), many of which function during mitosis in eukaryotes (Hoyt, supra). Some KRPs are
required for assembly of the mitotic spindle. In vivo and in vitro analyses suggest that these KRPs
exert force on microtubules that comprise the mitotic spindle, resulting in the separation of spindle
poles. Phosphorylation of KRP is required for this activity. Failure to assemble the mitotic spindle
results in abortive mitosis and chromosomal aneuploidy, the latter condition being characteristic of
cancer cells. In addition, a unique KRP, centromere protein E, localizes to the kinetochore of human
mitotic chromosomes and may play a role in their segregation to opposite spindle poles.
Dynamin-related Motor Proteins

Dynamin is a large GTPase motor protein that functions as a "molecular pinchase," generating
a mechanochemical force used to sever membranes. This activity is important in forming clathrin-
coated vesicles from coated pits in endocytosis and in the biogenesis of synaptic vesicles in neurons.
Binding of dynamin to a membrane leads to dynamin's self-assembly into spirals that may act to
constrict a flat membrane surface into a tubule. GTP hydrolysis induces a change in conformation of
the dynamin polymer that pinches the membrane tubule, leading to severing of the membrane tubule
and formation of a membrane vesicle. Release of GDP and inorganic phosphate leads to dynamin
disassembly. Following disassembly the dynamin may either dissociate from the membrane or remain
associated to the vesicle and be transported to another region of the cell. Three homologous dynamin
genes have been discovered, in addition to several dynamin-related proteins. Conserved dynamin
regions are the N-terminal GTP-binding domain, a central pleckstrin homology domain that binds
membranes, a central coiled-coil region that may activate dynamin's GTPase activity, and a C-
terminal proline-rich domain that contains several motifs that bind SH3 domains on other proteins.
Some dynamin-related proteins do not contain the pleckstrin homology domain or the proline-rich
domain. (See McNiven, M.A. (1998) Cell 94:151-154; Scaife, R.M. and R.L. Margolis (1997) Cell.
Signal. 9:395-401.)

The cytoskeleton is reviewed in Lodish, H. et al. (1995) Molecular Cell Biology, Scientific
American Books, New York NY.
Nucleic Acid-Associated Proteins

Multicellular organisms are comprised of diverse cell types that differ dramatically both in
structure and function. The identity of a cell is determined by its characteristic pattern of gene
expression, and different cell types express overlapping but distinctive sets of genes throughout
development. Spatial and temporal regulation of gene expression is critical for the control of cell
proliferation, cell differentiation, apoptosis, and other processes that contribute to organismal
development. Furthermore, gene expression is regulated in response to extracellular signals that
mediate cell-cell communication and coordinate the activities of different cell types. Appropriate gene
regulation also ensures that cells function efficiently by expressing only those genes whose functions
are required at a given time.
Transcription Factors

Transcriptional regulatory proteins are essential for the control of gene expression. Some of
these proteins function as transcription factors that initiate, activate, repress, or terminate gene
transcription. Transcription factors generally bind to the promoter, enhancer, and upstream regulatory
regions of a gene in a sequence-specific manner, although some factors bind regulatory elements
within or downstream of a gene coding region. Transcription factors may bind to a specific region of
DNA singly or as a complex with other accessory factors. (Reviewed in Lewin, B. (1990) Genes IV,
Oxford University Press, New York, NY, and Cell Press, Cambridge, MA, pp. 554-570.)

The double helix structure and repeated sequences of DNA create topological and chemical
features which can be recognized by transcription factors. These features are hydrogen bond donor
and acceptor groups, hydrophobic patches, major and minor grooves, and regular, repeated stretches
of sequence which induce distinct bends in the helix. Typically, transcription factors recognize
specific DNA sequence motifs of about 20 nucleotides in length. Multiple, adjacent transcription
factor-binding motifs may be required for gene regulation.

Many transcription factors incorporate DNA-binding structural motifs which comprise either α
helices or β sheets that bind to the major groove of DNA. Four well-characterized structural motifs
are helix-turn-helix, zinc finger, leucine zipper, and helix-loop-helix. Proteins containing these motifs
may act alone as monomers, or they may form homo- or heterodimers that interact with DNA.

The helix-turn-helix motif consists of two α helices connected at a fixed angle by a short chain
of amino acids. One of the helices binds to the major groove. Helix-turn-helix motifs are exemplified
by the homeobox motif which is present in homeodomain proteins. These proteins are critical for
specifying the anterior-posterior body axis during development and are conserved throughout the
animal kingdom. The Antennapedia and Ultrabithorax proteins of Drosophila melanogaster are
prototypical homeodomain proteins. (Pabo, C.O. and R.T. Sauer (1992) Ann. Rev. Biochem.
61:1053-1095.)

The zinc finger motif, which binds zinc ions, generally contains tandem repeats of about 30
amino acids consisting of periodically spaced cysteine and histidine residues. Examples of this
sequence pattern, designated C2H2 and C3HC4 ("RING" finger), have been described. (Lewin,
supra.) Zinc finger proteins each contain an α helix and an antiparallel β sheet whose proximity and
conformation are maintained by the zinc ion. Contact with DNA is made by the arginine preceding
the α helix and by the second, third, and sixth residues of the α helix. Variants of the zinc finger motif
include poorly defined cysteine-rich motifs which bind zinc or other metal ions. These motifs may not
contain histidine residues and are generally nonrepetitive. The zinc finger motif may be repeated in a
tandem array within a protein, such that the α helix of each zinc finger in the protein makes contact
with the major groove of the DNA double helix. This repeated contact between the protein and the
DNA produces a strong and specific DNA-protein interaction. The strength and specificity of the
interaction can be regulated by the number of zinc finger motifs within the protein. Though originally
identified in DNA-binding proteins as regions that interact directly with DNA, zinc fingers occur in a
variety of proteins that do not bind DNA (Lodish, H. et al. (1995) Molecular Cell Biology, Scientific
American Books, New York, NY, pp. 447-451). For example, Galcheva-Gargova, Z. et al. ((1996)
Science 272:1797-1802) have identified zinc finger proteins that interact with various cytokine
receptors.

The C2H2-type zinc finger signature motif contains a 28 amino acid sequence, including 2
conserved Cys and 2 conserved His residues in a C-2-C-12-H-3-H type motif. The motif generally
occurs in multiple tandem repeats. A cysteine-rich domain including the motif Asp-His-His-Cys
(DHHC-CRD) has been identified as a distinct subgroup of zinc finger proteins. The DHHC-CRD
region has been implicated in growth and development. One DHHC-CRD mutant shows defective
function of Ras, a small membrane-associated GTP-binding protein that regulates cell growth and
differentiation, while other DHHC-CRD proteins probably function in pathways not involving Ras
(Bartels, D.J. et al. (1999) Mol. Cell Biol. 19:6775-6787).

Zinc-finger transcription factors are often accompanied by modular sequence motifs such as
the Kruppel-associated box (KRAB) and the SCAN domain. For example, the
hypoalphalipoproteinemia susceptibility gene ZNF202 encodes a SCAN box and a KRAB domain
followed by eight C2H2 zinc-finger motifs (Honer, C. et al. (2001) Biochim. Biophys. Acta
1517:441-448). The SCAN domain is a highly conserved, leucine-rich motif of approximately 60
amino acids found at the amino-terminal end of zinc finger transcription factors. SCAN domains are
most often linked to C2H2 zinc finger motifs through their carboxyl-terminal end. Biochemical binding
studies have established the SCAN domain as a selective hetero- and homotypic oligomerization
domain. SCAN domain-mediated protein complexes may function to modulate the biological function
of transcription factors (Schumacher, C. et al. (2000) J. Biol. Chem. 275:17173-17179).

The KRAB (Kruppel-associated box) domain is a conserved amino acid sequence spanning
approximately 75 amino acids and is found in almost one-third of the 300 to 700 genes encoding C2H2
zinc fingers. The KRAB domain is found N-terminally with respect to the finger repeats. The KRAB
domain is generally encoded by two exons; the KRAB-A region or box is encoded by one exon and
the KRAB-B region or box is encoded by a second exon. The function of the KRAB domain is the
repression of transcription. Transcription repression is accomplished by recruitment of either the
KRAB-associated protein-1, a transcriptional corepressor, or the KRAB-A interacting protein.
Proteins containing the KRAB domain are likely to play a regulatory role during development
(Williams, A.J. et al. (1999) Mol. Cell Biol. 19:8526-8535). A subgroup of highly related human
KRAB zinc finger proteins detectable in all human tissues is highly expressed in human T lymphoid
cells (Bellefroid, E.J. et al. (1993) EMBO J. 12:1363-1374). The ZNF85 KRAB zinc finger gene, a
member of the human ZNF91 family, is highly expressed in normal adult testis, in seminomas, and in
the NT2/D1 teratocarcinoma cell line (Poncelet, D.A. et al. (1998) DNA Cell Biol. 17:931-943).

The C4 motif is found in hormone-regulated proteins. The C4 motif generally includes only 2
repeats. A number of eukaryotic and viral proteins contain a conserved cysteine-rich domain of 40
to 60 residues (called C3HC4 zinc-finger or RING finger) that binds two atoms of zinc, and is
probably involved in mediating protein-protein interactions. The 3D "cross-brace" structure of the
zinc ligation system is unique to the RING domain. The spacing of the cysteines in such a domain is
C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C. The PHD finger is a
C4HC3 zinc-finger-like motif found in nuclear proteins thought to be involved in chromatin-mediated
transcriptional regulation.
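
For illustration only, the RING finger spacing given above can be written as a regular expression
and used to scan a protein sequence for candidate domains. The following Python sketch assumes
that each "x(n to m)" term stands for n to m residues of any type. The names RING_PATTERN and
find_ring_fingers and the toy sequence are hypothetical and do not come from any cited reference.

```python
import re

# Assumption: "x(n to m)" in the spacing pattern means n to m residues of any type.
# C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C
RING_PATTERN = re.compile(
    r"C.{2}C.{9,39}C.{1,3}H.{2,3}C.{2}C.{4,48}C.{2}C"
)


def find_ring_fingers(protein: str) -> list[tuple[int, int]]:
    """Return (start, end) offsets of non-overlapping candidate RING fingers."""
    return [m.span() for m in RING_PATTERN.finditer(protein.upper())]


# Hypothetical toy sequence assembled to satisfy the minimum spacings.
toy = ("MK" + "C" + "AA" + "C" + "A" * 9 + "C" + "A" + "H" + "AA"
       + "C" + "AA" + "C" + "A" * 4 + "C" + "AA" + "C" + "LL")
print(find_ring_fingers(toy))
```

A match of this kind indicates only that the cysteine and histidine spacing is compatible with a
RING finger; it does not establish zinc binding or function.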

GATA-type transcription factors contain one or two zinc finger domains which bind
specifically to a region of DNA that contains the consecutive nucleotide sequence GATA. NMR
studies indicate that the zinc finger comprises two irregular anti-parallel β sheets and an α helix,
followed by a long loop to the C-terminal end of the finger (Omichinski, J.G. (1993) Science
261:438-446). The helix and the loop connecting the two β-sheets contact the major groove of the
DNA, while the C-terminal part, which determines the specificity of binding, wraps around into the
minor groove.
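
As a simple illustration of locating the consecutive nucleotide sequence GATA, the following
Python sketch scans one strand of a DNA sequence and reports occurrences on either strand. It
assumes that only the bare four-nucleotide core is of interest and ignores any flanking sequence
preference. The function name find_gata_sites and the example sequence are hypothetical.

```python
def find_gata_sites(dna: str) -> list[tuple[int, str]]:
    """Return (0-based position, strand) for each GATA core on either strand.

    Positions refer to the given strand, read 5'->3'. A "-" hit means that
    the opposite strand reads GATA at that location.
    """
    dna = dna.upper()
    sites = []
    for i in range(len(dna) - 3):
        window = dna[i:i + 4]
        if window == "GATA":
            sites.append((i, "+"))
        elif window == "TATC":  # reverse complement of GATA
            sites.append((i, "-"))
    return sites


# Hypothetical example sequence with one site on each strand.
print(find_gata_sites("CCGATAAGTTATCGG"))
```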

The LIM motif consists of about 60 amino acid residues and contains seven conserved
cysteine residues and a histidine within a consensus sequence (Schmeichel, K.L. and Beckerle, M.C.
(1994) Cell 79:211-219). The LIM family includes transcription factors and cytoskeletal proteins
which may be involved in development, differentiation, and cell growth. One example is actin-binding
LIM protein, which may play roles in regulation of the cytoskeleton and cellular morphogenesis (Roof,
D.J. et al. (1997) J. Cell Biol. 138:575-588). The N-terminal domain of actin-binding LIM protein has
four double zinc finger motifs with the LIM consensus sequence. The C-terminal domain of actin-
binding LIM protein shows sequence similarity to known actin-binding proteins such as dematin and
villin. Actin-binding LIM protein binds to F-actin through its dematin-like C-terminal domain. The
LIM domain may mediate protein-protein interactions with other LIM-binding proteins.

Myeloid cell development is controlled by tissue-specific transcription factors. Myeloid zinc
finger proteins (MZF) include MZF-1 and MZF-2. MZF-1 functions in regulation of the development
of neutrophilic granulocytes. A murine homolog, MZF-2, is expressed in myeloid cells, particularly in
the cells committed to the neutrophilic lineage. MZF-2 is down-regulated by G-CSF and appears to
have a unique function in neutrophil development (Murai, K. et al. (1997) Genes Cells 2:581-591).

The leucine zipper motif comprises a stretch of amino acids rich in leucine which can form an
amphipathic α helix. This structure provides the basis for dimerization of two leucine zipper proteins.
The region adjacent to the leucine zipper is usually basic, and upon protein dimerization, is optimally
positioned for binding to the major groove. Proteins containing such motifs are generally referred to as
bZIP transcription factors. The leucine zipper motif is found in the proto-oncogenes Fos and Jun,
which comprise the heterodimeric transcription factor AP1 involved in cell growth and the
determination of cell lineage (Papavassiliou, A. G. (1995) N. Engl. J. Med. 332:45-47).

The helix-loop-helix motif (HLH) consists of a short α helix connected by a loop to a longer α
helix. The loop is flexible and allows the two helices to fold back against each other and to bind to
DNA. The transcription factor Myc contains a prototypical HLH motif.

The NF-kappa-B/Rel signature defines a family of eukaryotic transcription factors involved in
oncogenesis, embryonic development, differentiation and immune response. Most transcription factors
containing the Rel homology domain (RHD) bind as dimers to a consensus DNA sequence motif
termed kappa-B. Members of the Rel family share a highly conserved 300 amino acid domain termed
the Rel homology domain. The characteristic Rel C-terminal domain is involved in gene activation and
cytoplasmic anchoring functions. Proteins known to contain the RHD domain include vertebrate
nuclear factor NF-kappa-B, which is a heterodimer of a DNA-binding subunit and the transcription
factor p65, mammalian transcription factor RelB, and vertebrate proto-oncogene c-rel, a protein
associated with differentiation and lymphopoiesis (Kabrun, N., and Enrietto, P.J. (1994) Semin.
Cancer Biol. 5:103-112).

A DNA binding motif termed ARID (AT-rich interactive domain) distinguishes an
evolutionarily conserved family of proteins. The approximately 100-residue ARID sequence is present
in a series of proteins strongly implicated in the regulation of cell growth, development, and
tissue-specific gene expression. ARID proteins include Bright (a regulator of B-cell-specific gene
expression), dead ringer (involved in development), and MRF-2 (which represses expression from the
cytomegalovirus enhancer) (Dallas, P.B. et al. (2000) Mol. Cell Biol. 20:3137-3146).

The ELM2 (Egl-27 and MTA1 homology 2) domain is found in metastasis-associated protein
MTA1 and protein ER1. The Caenorhabditis elegans gene egl-27 is required for embryonic patterning.
MTA1, a human gene with elevated expression in metastatic carcinomas, is a component of a protein
complex with histone deacetylase and nucleosome remodelling activities (Solari, F. et al. (1999)
Development 126:2483-2494). The ELM2 domain is usually found to the N terminus of a myb-like
DNA binding domain. ELM2 is also found associated with an ARID DNA binding domain.

Most transcription factors contain characteristic DNA binding motifs, and variations on the
above motifs and new motifs have been and are currently being characterized. (Faisst, S. and S.
Meyer (1992) Nucl. Acids Res. 20:3-26.)
Chromatin Associated Proteins

In the nucleus, DNA is packaged into chromatin, the compact organization of which limits the
accessibility of DNA to transcription factors and plays a key role in gene regulation. (Lewin, supra,
pp. 409-410.) The compact structure of chromatin is determined and influenced by chromatin-
associated proteins such as the histones, the high mobility group (HMG) proteins, and the
chromodomain proteins. There are five classes of histones, H1, H2A, H2B, H3, and H4, all of which
are highly basic, low molecular weight proteins. The fundamental unit of chromatin, the nucleosome,
consists of 200 base pairs of DNA associated with two copies each of H2A, H2B, H3, and H4. H1
links adjacent nucleosomes. HMG proteins are low molecular weight, non-histone proteins that may
play a role in unwinding DNA and stabilizing single-stranded DNA. Chromodomain proteins play a
key role in the formation of highly compacted heterochromatin, which is transcriptionally silent.
Diseases and Disorders Related to Gene Regulation

Many neoplastic disorders in humans can be attributed to inappropriate gene expression.
Malignant cell growth may result from either excessive expression of tumor promoting genes or
insufficient expression of tumor suppressor genes. (Cleary, M.L. (1992) Cancer Surv. 15:89-104.)
The zinc finger-type transcriptional regulator WT1 is a tumor-suppressor protein that is inactivated in
children with Wilms' tumor. The oncogene bcl-6, which plays an important role in large-cell
lymphoma, is also a zinc-finger protein (Papavassiliou, A. G. (1995) N. Engl. J. Med. 332:45-47).
Chromosomal translocations may also produce chimeric loci that fuse the coding sequence of one
gene with the regulatory regions of a second unrelated gene. Such an arrangement likely results in
inappropriate gene transcription, potentially contributing to malignancy. In Burkitt's lymphoma, for
example, the transcription factor Myc is translocated to the immunoglobulin heavy chain locus, greatly
enhancing Myc expression and resulting in rapid cell growth leading to leukemia (Latchman, D. S.
(1996) N. Engl. J. Med. 334:28-33).

In addition, the immune system responds to infection or trauma by activating a cascade of
events that coordinate the progressive selection, amplification, and mobilization of cellular defense
mechanisms. A complex and balanced program of gene activation and repression is involved in this
process. However, hyperactivity of the immune system as a result of improper or insufficient
regulation of gene expression may result in considerable tissue or organ damage. This damage is well-
documented in immunological responses associated with arthritis, allergens, heart attack, stroke, and
infections. (Isselbacher et al., Harrison's Principles of Internal Medicine, 13/e, McGraw Hill, Inc. and
Teton Data Systems Software, 1996.) The causative gene for autoimmune polyendocrinopathy-
candidiasis-ectodermal dystrophy (APECED) was recently isolated and found to encode a protein
with two PHD-type zinc finger motifs (Bjorses, P. et al. (1998) Hum. Mol. Genet. 7:1547-1553).

Furthermore, the generation of multicellular organisms is based upon the induction and
coordination of cell differentiation at the appropriate stages of development. Central to this process is
differential gene expression, which confers the distinct identities of cells and tissues throughout the
body. Failure to regulate gene expression during development could result in developmental disorders.
Human developmental disorders caused by mutations in zinc finger-type transcriptional regulators
include: urogenital developmental abnormalities associated with WT1; Greig cephalopolysyndactyly,
Pallister-Hall syndrome, and postaxial polydactyly type A (GLI3); and Townes-Brocks syndrome,
characterized by anal, renal, limb, and ear abnormalities (SALL1) (Engelkamp, D. and V. van
Heyningen (1996) Curr. Opin. Genet. Dev. 6:334-342; Kohlhase, J. et al. (1999) Am. J. Hum. Genet.
64:435-445).

SYNTHESIS OF NUCLEIC ACIDS 
Polymerases
DNA and RNA replication are critical processes for cell replication and function. DNA and
RNA replication are mediated by the enzymes DNA and RNA polymerase, respectively, by a
"templating" process in which the nucleotide sequence of a DNA or RNA strand is copied by
complementary base-pairing into a complementary nucleic acid sequence of either DNA or RNA.
However, there are fundamental differences between the two processes.

DNA polymerase catalyzes the stepwise addition of a deoxyribonucleotide to the 3'-OH end
of a polynucleotide strand (the primer strand) that is paired to a second (template) strand. The new
DNA strand therefore grows in the 5' to 3' direction (Alberts, B. et al. (1994) The Molecular Biology
of the Cell, Garland Publishing Inc., New York, NY, pp 251-254). The substrates for the
polymerization reaction are the corresponding deoxynucleotide triphosphates which must base-pair
with the correct nucleotide on the template strand in order to be recognized by the polymerase.
Because DNA exists as a double-stranded helix, each of the two strands may serve as a template for
the formation of a new complementary strand. Each of the two daughter cells of a dividing cell
therefore inherits a new DNA double helix containing one old and one new strand. Thus, DNA is said
to be replicated "semiconservatively" by DNA polymerase. In addition to the synthesis of new DNA,
DNA polymerase is also involved in the repair of damaged DNA as discussed below under "Ligases."

In contrast to DNA polymerase, RNA polymerase uses a DNA template strand to
"transcribe" DNA into RNA using ribonucleotide triphosphates as substrates. Like DNA
polymerization, RNA polymerization proceeds in a 5' to 3' direction by addition of a ribonucleoside
monophosphate to the 3'-OH end of a growing RNA chain. DNA transcription generates messenger
RNAs (mRNA) that carry information for protein synthesis, as well as the transfer, ribosomal, and
other RNAs that have structural or catalytic functions. In eukaryotes, three discrete RNA
polymerases synthesize the three different types of RNA (Alberts et al., supra pp. 367-368). RNA
polymerase I makes the large ribosomal RNAs, RNA polymerase II makes the mRNAs that will be
translated into proteins, and RNA polymerase III makes a variety of small, stable RNAs, including 5S
ribosomal RNA and the transfer RNAs (tRNA). In all cases, RNA synthesis is initiated by binding of
the RNA polymerase to a promoter region on the DNA and synthesis begins at a start site within the
promoter. Synthesis is completed at a stop (termination) signal in the DNA whereupon both the
polymerase and the completed RNA chain are released.
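
The templating process described in this section can be sketched in a few lines of Python. The
sketch is illustrative only: it models base-pairing and the 5' to 3' direction of growth, and it
ignores primers, promoters, termination signals, and the nucleotide triphosphate substrates. The
names copy_template, replicate, and transcribe are hypothetical.

```python
# Watson-Crick pairing used when the new strand is DNA or RNA.
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def copy_template(template_5to3: str, pairing: dict[str, str]) -> str:
    """Return the newly made strand, written 5'->3'."""
    new_strand = []
    # The template is read 3'->5', so the new strand grows 5'->3'.
    for base in reversed(template_5to3.upper()):
        new_strand.append(pairing[base])  # one nucleotide added at the 3' end
    return "".join(new_strand)


def replicate(dna_template_5to3: str) -> str:
    """DNA-templated synthesis of a complementary DNA strand."""
    return copy_template(dna_template_5to3, DNA_PAIR)


def transcribe(dna_template_5to3: str) -> str:
    """DNA-templated synthesis of a complementary RNA strand."""
    return copy_template(dna_template_5to3, RNA_PAIR)


print(replicate("ATGC"))   # new DNA strand, 5'->3'
print(transcribe("ATGC"))  # new RNA strand, 5'->3'
```

Copying the product of replicate a second time returns the original sequence, which mirrors the
way each old strand specifies its new partner in semiconservative replication.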

Ligases

DNA repair is the process by which accidental base changes, such as those produced by
oxidative damage, hydrolytic attack, or uncontrolled methylation of DNA, are corrected before
replication or transcription of the DNA can occur. Because of the efficiency of the DNA repair
process, fewer than one in a thousand accidental base changes causes a mutation (Alberts et al.,
supra pp. 245-249). The three steps common to most types of DNA repair are (1) excision of the
damaged or altered base or nucleotide by DNA nucleases, (2) insertion of the correct nucleotide in the
gap left by the excised nucleotide by DNA polymerase using the complementary strand as the
template and, (3) sealing the break left between the inserted nucleotide(s) and the existing DNA
strand by DNA ligase. In the last reaction, DNA ligase uses the energy from ATP hydrolysis to
activate the 5' end of the broken phosphodiester bond before forming the new bond with the 3'-OH of
the DNA strand. In Bloom's syndrome, an inherited human disease, individuals are partially deficient
in DNA ligation and consequently have an increased incidence of cancer (Alberts et al., supra p. 247).

Nucleases

Nucleases comprise enzymes that hydrolyze both DNA (DNase) and RNA (RNase). They
serve different purposes in nucleic acid metabolism. Nucleases hydrolyze the phosphodiester bonds
between adjacent nucleotides either at internal positions (endonucleases) or at the terminal 3' or 5'
nucleotide positions (exonucleases). A DNA exonuclease activity in DNA polymerase, for example,
serves to remove improperly paired nucleotides attached to the 3'-OH end of the growing DNA strand
by the polymerase and thereby serves a "proofreading" function. As mentioned above, DNA
endonuclease activity is involved in the excision step of the DNA repair process.
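
The proofreading function described above can be illustrated with a short Python sketch that
trims improperly paired nucleotides from the 3' end of a growing strand. It assumes the growing
strand is no longer than the template and that the template is supplied 3' to 5' so that equal
positions face one another. The function name proofread is hypothetical.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def proofread(growing_5to3: str, template_3to5: str) -> str:
    """Remove mispaired nucleotides from the 3' end of the growing strand.

    Position i of the growing strand (5'->3') faces position i of the
    template (written 3'->5'). Assumes len(growing) <= len(template).
    """
    growing = growing_5to3.upper()
    template = template_3to5.upper()
    # Exonuclease step: trim from the 3' end while the terminal pair is wrong.
    while growing and PAIR[template[len(growing) - 1]] != growing[-1]:
        growing = growing[:-1]
    return growing


# Template 3'-TACG-5' calls for 5'-ATGC-3'; the final A below is a mispair.
print(proofread("ATGA", "TACG"))
```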

RNases also serve a variety of functions. For example, RNase P is a ribonucleoprotein
enzyme which cleaves the 5' end of pre-tRNAs as part of their maturation process. RNase H digests
the RNA strand of an RNA/DNA hybrid. Such hybrids occur in cells invaded by retroviruses, and
RNase H is an important enzyme in the retroviral replication cycle. Pancreatic RNase secreted by
the pancreas into the intestine hydrolyzes RNA present in ingested foods. RNase activity in serum
and cell extracts is elevated in a variety of cancers and infectious diseases (Schein, C.H. (1997) Nat.
Biotechnol. 15:529-536). Regulation of RNase activity is being investigated as a means to control
tumor angiogenesis, allergic reactions, viral infection and replication, and fungal infections.
MODIFICATION OF NUCLEIC ACIDS 
Methylases

Methylation of specific nucleotides occurs in both DNA and RNA, and serves different
functions in the two macromolecules. Methylation of cytosine residues to form 5-methyl cytosine in
DNA occurs specifically in CG sequences which are base-paired with one another in the DNA
double-helix. The pattern of methylation is passed from generation to generation during DNA
replication by an enzyme called "maintenance methylase" that acts preferentially on those CG
sequences that are base-paired with a CG sequence that is already methylated. Such methylation
appears to distinguish active from inactive genes by preventing the binding of regulatory proteins that
"turn on" the gene, but permitting the binding of proteins that inactivate the gene (Alberts et al. supra
pp. 448-451). In RNA metabolism, "tRNA methylase" produces one of several nucleotide
modifications in tRNA that affect the conformation and base-pairing of the molecule and facilitate the
recognition of the appropriate mRNA codons by specific tRNAs. The primary methylation pattern is
the dimethylation of guanine residues to form N,N-dimethyl guanine.
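
The inheritance of a CG methylation pattern by maintenance methylation can be modeled with a
brief Python sketch. The model is a deliberate simplification: it treats the newly made strand as
unmethylated after replication and marks a cytosine on that strand only where the paired CG on
the old strand is already methylated. The function name maintenance_methylate and the coordinate
convention are hypothetical.

```python
def maintenance_methylate(old_strand_5to3: str, old_methylated: set[int]) -> set[int]:
    """Return positions of cytosines methylated on the newly made strand.

    old_methylated holds 0-based indices of methylated C's on the old strand.
    Returned positions use the old strand's coordinates: a CG at (i, i+1) on
    the old strand pairs with a CG whose C lies opposite the G at i+1.
    """
    seq = old_strand_5to3.upper()
    new_marks = set()
    for i in range(len(seq) - 1):
        # Act only on CG sequences whose old-strand C is already methylated.
        if seq[i:i + 2] == "CG" and i in old_methylated:
            new_marks.add(i + 1)
    return new_marks


# Old strand has CG at positions 1 and 4; only the first is methylated.
print(maintenance_methylate("ACGTCGA", {1}))
```

Under this model an unmethylated CG stays unmethylated on both strands, so the pattern present
before replication is the pattern restored after it.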
Helicases and Single-stranded Binding Proteins

Helicases are enzymes that destabilize and unwind double helix structures in both DNA and
RNA. Since DNA replication occurs more or less simultaneously on both strands, the two strands
must first separate to generate a replication "fork" for DNA polymerase to act on. Two types of
replication proteins contribute to this process, DNA helicases and single-stranded binding proteins.
DNA helicases hydrolyze ATP and use the energy of hydrolysis to separate the DNA strands.
Single-stranded binding proteins (SSBs) then bind to the exposed DNA strands, without covering the
bases, thereby temporarily stabilizing them for templating by the DNA polymerase (Alberts et al.
supra pp. 255-256).

RNA helicases also alter and regulate RNA conformation and secondary structure. Like the
DNA helicases, RNA helicases utilize energy derived from ATP hydrolysis to destabilize and unwind
RNA duplexes. The most well-characterized and ubiquitous family of RNA helicases is the DEAD-
box family, so named for the conserved B-type ATP-binding motif which is diagnostic of proteins in
this family. Over 40 DEAD-box helicases have been identified in organisms as diverse as bacteria,
insects, yeast, amphibians, mammals, and plants. DEAD-box helicases function in diverse processes
such as translation initiation, splicing, ribosome assembly, and RNA editing, transport, and stability.
Examples of these RNA helicases include yeast Drs1 protein, which is involved in ribosomal RNA
processing; yeast TIF1 and TIF2 and mammalian eIF-4A, which are essential to the initiation of RNA
translation; and human p68 antigen, which regulates cell growth and division (Ripmaster, T.L. et al.
(1992) Proc. Natl. Acad. Sci. USA 89:11131-11135; Chang, T.-H. et al. (1990) Proc. Natl. Acad. Sci.
USA 87:1571-1575). These RNA helicases demonstrate strong sequence homology over a stretch of
some 420 amino acids. Included among these conserved sequences are the consensus sequence for
the A motif of an ATP binding protein; the "DEAD box" sequence, associated with ATPase activity;
the sequence SAT, associated with the actual helicase unwinding region; and an octapeptide
consensus sequence, required for RNA binding and ATP hydrolysis (Pause, A. et al. (1993) Mol. Cell
Biol. 13:6789-6798). Differences outside of these conserved regions are believed to reflect
differences in the functional roles of individual proteins (Chang, T.-H. et al. (1990) Proc. Natl. Acad.
Sci. USA 87:1571-1575).

Some DEAD-box helicases play tissue- and stage-specific roles in spermatogenesis and
embryogenesis. Overexpression of the DEAD-box 1 protein (DDX1) may play a role in the
progression of neuroblastoma (Nb) and retinoblastoma (Rb) tumors (Godbout, R. et al. (1998) J. Biol.
Chem. 273:21161-21168). These observations suggest that DDX1 may promote or enhance tumor
progression by altering the normal secondary structure and expression levels of RNA in cancer cells.
Other DEAD-box helicases have been implicated either directly or indirectly in tumorigenesis.
(Discussed in Godbout, supra.) For example, murine p68 is mutated in ultraviolet light-induced tumors,
and human DDX6 is located at a chromosomal breakpoint associated with B-cell lymphoma.
Similarly, a chimeric protein comprised of DDX10 and NUP98, a nucleoporin protein, may be involved
in the pathogenesis of certain myeloid malignancies.

Topoisomerases 

Besides the need to separate DNA strands prior to replication, the two strands must be
"unwound" from one another prior to their separation by DNA helicases. This function is performed
by proteins known as DNA topoisomerases. DNA topoisomerase effectively acts as a reversible
nuclease that hydrolyzes a phosphodiester bond in a DNA strand, permits the two strands to rotate
freely about one another to remove the strain of the helix, and then rejoins the original phosphodiester
bond between the two strands. Topoisomerases are essential enzymes responsible for the topological
rearrangement of DNA brought about by transcription, replication, chromatin formation,
recombination, and chromosome segregation. Superhelical coils are introduced into DNA by the
passage of processive enzymes such as RNA polymerase, or by the separation of DNA strands by a
helicase prior to replication. Knotting and concatenation can occur in the process of DNA synthesis,
storage, and repair. All topoisomerases work by breaking a phosphodiester bond in the ribose-
phosphate backbone of DNA. A catalytic tyrosine residue on the enzyme makes a nucleophilic attack
on the scissile phosphodiester bond, resulting in a reaction intermediate in which a covalent bond is
formed between the enzyme and one end of the broken strand. A tyrosine-DNA phosphodiesterase
functions in DNA repair by hydrolyzing this bond in occasional dead-end topoisomerase I-DNA
intermediates (Pouliot, J.J. et al. (1999) Science 286:552-555).

Two types of DNA topoisomerase exist, types I and II. Type I topoisomerases work as
monomers, making a break in a single strand of DNA while type II topoisomerases, working as
homodimers, cleave both strands. DNA topoisomerase I causes a single-strand break in a DNA
helix to allow the rotation of the two strands of the helix about the remaining phosphodiester bond in
the opposite strand. DNA topoisomerase II causes a transient break in both strands of a DNA helix
where two double helices cross over one another. This type of topoisomerase can efficiently separate
two interlocked DNA circles (Alberts et al. supra pp. 260-262). Type II topoisomerases are largely
confined to proliferating cells in eukaryotes, such as cancer cells. For this reason they are targets for
anticancer drugs. Topoisomerase II has been implicated in multi-drug resistance (MDR) as it appears
to aid in the repair of DNA damage inflicted by DNA binding agents such as doxorubicin and
vincristine.

The topoisomerase I family includes topoisomerases I and III (topo I and topo III). The
crystal structure of human topoisomerase I suggests that rotation about the intact DNA strand is
partially controlled by the enzyme. In this "controlled rotation" model, protein-DNA interactions limit
the rotation, which is driven by torsional strain in the DNA (Stewart, L. et al. (1998) Science
279:1534-1541). Structurally, topo I can be recognized by its catalytic tyrosine residue and a number
of other conserved residues in the active site region. Topo I is thought to function during transcription.
Two topo IIIs are known in humans, and they are homologous to prokaryotic topoisomerase I, with a
conserved tyrosine and active site signature specific to this family. Topo III has been suggested to
play a role in meiotic recombination. A mouse topo III is highly expressed in testis tissue and its
expression increases with the increase in the number of cells in pachytene (Seki, T. et al. (1998) J.
Biol. Chem. 273:28553-28556).

The topoisomerase II family includes two isozymes (IIα and IIβ) encoded by different genes.
Topo II cleaves double stranded DNA in a reproducible, nonrandom fashion, preferentially in an AT
rich region, but the basis of cleavage site selectivity is not known. Structurally, topo II is made up of
four domains, the first two of which are structurally similar and probably distantly homologous to
similar domains in eukaryotic topo I. The second domain bears the catalytic tyrosine, as well as a
highly conserved pentapeptide. The IIα isoform appears to be responsible for unlinking DNA during
chromosome segregation. Cell lines expressing IIα but not IIβ suggest that IIβ is dispensable in
cellular processes; however, IIβ knockout mice died perinatally due to a failure in neural development.
That the major abnormalities occurred in predominantly late developmental events (neurogenesis)
suggests that IIβ is needed not at mitosis, but rather during DNA repair (Yang, X. et al. (2000)
Science 287:131-134).

Topoisomerases have been implicated in a number of disease states, and topoisomerase
poisons have proven to be effective anti-tumor drugs for some human malignancies. Topo I is
mislocalized in Fanconi's anemia, and may be involved in the chromosomal breakage seen in this
disorder (Wunder, E. (1984) Hum. Genet. 68:276-281). Overexpression of a truncated topo III in
ataxia-telangiectasia (A-T) cells partially suppresses the A-T phenotype, probably through a dominant
negative mechanism. This suggests that topo III is deregulated in A-T (Fritz, E. et al. (1997) Proc.
Natl. Acad. Sci. USA 94:4538-4542). Topo III also interacts with the Bloom's Syndrome gene
product, and has been suggested to have a role as a tumor suppressor (Wu, L. et al. (2000) J. Biol.
Chem. 275:9636-9644). Aberrant topo II activity is often associated with cancer or increased cancer
risk. Greatly lowered topo II activity has been found in some, but not all, A-T cell lines (Mohamed, R.
et al. (1987) Biochem. Biophys. Res. Commun. 149:233-238). On the other hand, topo II can break
DNA in the region of the A-T gene (ATM), which controls all DNA damage-responsive cell cycle
checkpoints (Kaufmann, W.K. (1998) Proc. Soc. Exp. Biol. Med. 217:327-334). The ability of
topoisomerases to break DNA has been used as the basis of antitumor drugs. Topoisomerase poisons
act by increasing the number of dead-end covalent DNA-enzyme complexes in the cell, ultimately
triggering cell death pathways (Fortune, J.M. and N. Osheroff (2000) Prog. Nucleic Acid Res. Mol.
Biol. 64:221-253; Guichard, S.M. and M.K. Danks (1999) Curr. Opin. Oncol. 11:482-489). Antibodies
against topo I are found in the serum of systemic sclerosis patients, and the levels of the antibody may
be used as a marker of pulmonary involvement in the disease (Diot, E. et al. (1999) Chest 116:715-720).
Finally, the DNA binding region of human topo I has been used as a DNA delivery vehicle for
gene therapy (Chen, T.Y. et al. (2000) Appl. Microbiol. Biotechnol. 53:558-567).
Recombinases

Genetic recombination is the process of rearranging DNA sequences within an organism's
genome to provide genetic variation for the organism in response to changes in the environment.
DNA recombination allows variation in the particular combination of genes present in an individual's
genome, as well as the timing and level of expression of these genes. (See Alberts et al. supra pp.
263-273.) Two broad classes of genetic recombination are commonly recognized, general
recombination and site-specific recombination. General recombination involves genetic exchange
between any homologous pair of DNA sequences usually located on two copies of the same
chromosome. The process is aided by enzymes, recombinases, that "nick" one strand of a DNA
duplex more or less randomly and permit exchange with a complementary strand on another duplex.
The process does not normally change the arrangement of genes in a chromosome. In site-specific
recombination, the recombinase recognizes specific nucleotide sequences present in one or both of the
recombining molecules. Base-pairing is not involved in this form of recombination and therefore it
does not require DNA homology between the recombining molecules. Unlike general recombination,
this form of recombination can alter the relative positions of nucleotide sequences in chromosomes.
RNA METABOLISM

Ribonucleic acid (RNA) is a linear single-stranded polymer of four nucleotides, ATP, CTP,
UTP, and GTP. In most organisms, RNA is transcribed as a copy of deoxyribonucleic acid (DNA),
the genetic material of the organism. In retroviruses RNA rather than DNA serves as the genetic
material. RNA copies of the genetic material encode proteins or serve various structural, catalytic, or
regulatory roles in organisms. RNA is classified according to its cellular localization and function.
Messenger RNAs (mRNAs) encode polypeptides. Ribosomal RNAs (rRNAs) are assembled, along
with ribosomal proteins, into ribosomes, which are cytoplasmic particles that translate mRNA into
polypeptides. Transfer RNAs (tRNAs) are cytosolic adaptor molecules that function in mRNA
translation by recognizing both an mRNA codon and the amino acid that matches that codon.
Heterogeneous nuclear RNAs (hnRNAs) include mRNA precursors and other nuclear RNAs of
various sizes. Small nuclear RNAs (snRNAs) are a part of the nuclear spliceosome complex that
removes intervening, non-coding sequences (introns) and rejoins exons in pre-mRNAs.

Proteins are associated with RNA during its transcription from DNA, RNA processing, and
translation of mRNA into protein. Proteins are also associated with RNA as it is used for structural,
catalytic, and regulatory purposes.

RNA Processing

Ribosomal RNAs (rRNAs) are assembled, along with ribosomal proteins, into ribosomes,
which are cytoplasmic particles that translate messenger RNA (mRNA) into polypeptides. The
eukaryotic ribosome is composed of a 60S (large) subunit and a 40S (small) subunit, which together
form the 80S ribosome. In addition to the 18S, 28S, 5S, and 5.8S rRNAs, ribosomes contain from 50
to over 80 different ribosomal proteins, depending on the organism. Ribosomal proteins are classified
according to the subunit to which they belong (i.e., L if associated with the large 60S subunit or S if
associated with the small 40S subunit). E. coli ribosomes have been the most thoroughly studied and
contain 50 proteins, many of which are conserved in all life forms. The structures of nine ribosomal
proteins have been solved to less than 3.0 Å resolution (i.e., S5, S6, S17, L1, L6, L9, L12, L14, L30),
revealing common motifs, such as β-α-β protein folds in addition to acidic and basic RNA-binding
motifs positioned between β-strands. Most ribosomal proteins are believed to contact rRNA directly
(reviewed in Liljas, A. and Garber, M. (1995) Curr. Opin. Struct. Biol. 5:721-727; see also Woodson,
S.A. and Leontis, N.B. (1998) Curr. Opin. Struct. Biol. 8:294-300; Ramakrishnan, V. and White, S.W.
(1998) Trends Biochem. Sci. 23:208-212).

Ribosomal proteins may undergo post-translational modifications or interact with other
ribosome-associated proteins to regulate translation. For example, the highly homologous 40S
ribosomal protein S6 kinases (S6K1 and S6K2) play a key role in the regulation of cell growth by
controlling the biosynthesis of translational components which make up the protein synthetic apparatus
(including the ribosomal proteins). In the case of S6K1, at least eight phosphorylation sites are
believed to mediate kinase activation in a hierarchical fashion (Dufner and Thomas (1999) Exp. Cell
Res. 253:100-109). Some of the ribosomal proteins, including L1, also function as translational
repressors by binding to polycistronic mRNAs encoding ribosomal proteins (reviewed in Liljas, A.
and Garber, M., supra).

Recent evidence suggests that a number of ribosomal proteins have secondary functions
independent of their involvement in protein biosynthesis. These proteins function as regulators of cell
proliferation and, in some instances, as inducers of cell death. For example, the expression of human
ribosomal protein L13a has been shown to induce apoptosis by arresting cell growth in the G2/M
phase of the cell cycle. Inhibition of expression of L13a induces apoptosis in target cells, which
suggests that this protein is necessary, in the appropriate amount, for cell survival. Similar results have
been obtained in yeast where inactivation of yeast homologues of L13a, rp22 and rp23, results in
severe growth retardation and death. A closely related ribosomal protein, L7, arrests cells in G1 and
also induces apoptosis. Thus, it appears that a subset of ribosomal proteins may function as cell cycle
checkpoints and compose a new family of cell proliferation regulators.

Mapping of individual ribosomal proteins on the surface of intact ribosomes is accomplished
using 3D immunocryoelectron microscopy, whereby antibodies raised against specific ribosomal
proteins are visualized. Progress has been made toward the mapping of L1, L7, and L12, while the
structure of the intact ribosome has been solved to only 20-25 Å resolution and inconsistencies exist
among different crude structures (Frank, J. (1997) Curr. Opin. Struct. Biol. 7:266-272).

Three distinct sites have been identified on the ribosome. The aminoacyl-tRNA acceptor site
(A site) receives charged tRNAs (with the exception of the initiator-tRNA). The peptidyl-tRNA site
(P site) binds the nascent polypeptide as the amino acid from the A site is added to the elongating
chain. Deacylated tRNAs bind in the exit site (E site) prior to their release from the ribosome. The
structure of the ribosome is reviewed in Stryer, L. (1995) Biochemistry, W.H. Freeman and Company,
New York NY, pp. 888-908; Lodish, H. et al. (1995) Molecular Cell Biology, Scientific American
Books, New York NY, pp. 119-138; and Lewin, B. (1997) Genes VI, Oxford University Press, Inc.,
New York NY.

Various proteins are necessary for processing of transcribed RNAs in the nucleus.
Pre-mRNA processing steps include capping at the 5' end with methylguanosine, polyadenylating the 3'
end, and splicing to remove introns. The primary RNA transcript from DNA is a faithful copy of the
gene containing both exon and intron sequences, and the latter sequences must be cut out of the RNA
transcript to produce an mRNA that codes for a protein. This "splicing" of the mRNA sequence takes
place in the nucleus with the aid of a large, multicomponent ribonucleoprotein complex known as a
spliceosome. The spliceosomal complex is composed of five small nuclear ribonucleoprotein particles
(snRNPs) designated U1, U2, U4, U5, and U6. Each snRNP contains a single species of snRNA and
about ten proteins. The RNA components of some snRNPs recognize and base-pair with intron
consensus sequences. The protein components mediate spliceosome assembly and the splicing
reaction. Autoantibodies to snRNP proteins are found in the blood of patients with systemic lupus
erythematosus (Stryer, L. (1995) Biochemistry, W.H. Freeman and Company, New York NY, p. 863).

Several splicing regulatory proteins have been identified in Drosophila. Human (HsSWAP)
and mouse (MmSWAP) homologs of the suppressor-of-white-apricot (su(wa)) gene have been cloned
and characterized. HsSWAP and MmSWAP both have five regions highly homologous to su(wa),
including an arginine/serine-rich domain and two repeated modules that are homologous to regions in
the constitutive splicing factor, SPP91/PRP21. Mammalian SWAP mRNAs are alternatively spliced
at the same splice sites as in Drosophila. The splice junctions of the Drosophila and mammalian
regulated introns are conserved. Thus, research suggests that the mammalian SWAP gene functions
as a vertebrate alternative splicing regulator (Denhez, F. and Lafyatis, R. (1994) J. Biol. Chem.
269:16170-16179).

Serine- and arginine-rich pre-mRNA splicing factors (SR proteins) are phosphorylated before
they regulate splicing events. SRrp86 (SR-related protein of 86 kDa) is a novel SR protein containing
a single amino-terminal RNA recognition motif and two carboxy-terminal domains rich in
serine-arginine (SR) dipeptides. SRrp86 activates splicing in the presence of SRp20. However, it inhibits the
in vitro and in vivo activation of specific splice sites by SR proteins, including ASF/SF2, SC35, and
SRp55. Research suggests that pairwise combination of SRrp86 with specific SR proteins leads to
altered splicing efficiency and differential splice site selection (Barnard, D.C. and Patton, J.G. (2000)
Mol. Cell. Biol. 20:3049-3057).

Heterogeneous nuclear ribonucleoproteins (hnRNPs) have been identified that have roles in
splicing, exporting of the mature RNAs to the cytoplasm, and mRNA translation (Biamonti, G. et al.
(1998) Clin. Exp. Rheumatol. 16:317-326). Some examples of hnRNPs include the yeast proteins
Hrp1p, involved in cleavage and polyadenylation at the 3' end of the RNA; Cbp80p, involved in
capping the 5' end of the RNA; and Npl3p, a homolog of mammalian hnRNP A1, involved in export of
mRNA from the nucleus (Shen, E.C. et al. (1998) Genes Dev. 12:679-691). HnRNPs have been
shown to be important targets of the autoimmune response in rheumatic diseases (Biamonti, supra).

Many snRNP and hnRNP proteins are characterized by an RNA recognition motif (RRM).
(Reviewed in Birney, E. et al. (1993) Nucleic Acids Res. 21:5803-5816.) The RRM is about 80 amino
acids in length and forms four β-strands and two α-helices arranged in an α/β sandwich. The RRM
contains a core RNP-1 octapeptide motif along with surrounding conserved sequences. In addition to
snRNP proteins, examples of RNA-binding proteins which contain the above motifs include
heteronuclear ribonucleoproteins which stabilize nascent RNA and factors which regulate alternative
splicing. Alternative splicing factors include developmentally regulated proteins, specific examples of
which have been identified in lower eukaryotes such as Drosophila melanogaster and Caenorhabditis
elegans. These proteins play key roles in developmental processes such as pattern formation and sex
determination, respectively. (See, for example, Hodgkin, J. et al. (1994) Development 120:3681-3689.)

The 3' ends of most eukaryote mRNAs are also posttranscriptionally modified by
polyadenylation. Polyadenylation proceeds through two enzymatically distinct steps: (i) the
endonucleolytic cleavage of nascent mRNAs at cis-acting polyadenylation signals in the
3'-untranslated (non-coding) region and (ii) the addition of a poly(A) tract to the 5' mRNA fragment.
The presence of cis-acting RNA sequences is necessary for both steps. These sequences include
5'-AAUAAA-3' located 10-30 nucleotides upstream of the cleavage site and a less well-conserved
GU- or U-rich sequence element located 10-30 nucleotides downstream of the cleavage site. Cleavage
stimulation factor (CstF), cleavage factor I (CF I), and cleavage factor II (CF II) are involved in the
cleavage reaction while cleavage and polyadenylation specificity factor (CPSF) and poly(A)
polymerase (PAP) are necessary for both cleavage and polyadenylation. An additional enzyme,
poly(A)-binding protein II (PAB II), promotes poly(A) tract elongation (Rüegsegger, U. et al. (1996)
J. Biol. Chem. 271:6107-6113; and references within).
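The positional relationship between the AAUAAA hexamer and the cleavage site described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `find_poly_a_signals`, its parameters, and the convention of measuring the distance from the 3' end of the hexamer to a 0-based cleavage index are hypothetical choices made here, not part of the described methods.

```python
import re

def find_poly_a_signals(rna, cleavage_site, min_up=10, max_up=30):
    """Return 0-based start positions of AAUAAA hexamers that end
    min_up..max_up nucleotides upstream of cleavage_site.

    cleavage_site is a 0-based index into rna (assumed convention).
    DNA-style input is accepted by converting T to U.
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    # A zero-width lookahead reports every occurrence, including overlapping ones.
    for match in re.finditer(r"(?=AAUAAA)", rna):
        start = match.start()
        # Number of nucleotides between the end of the hexamer and the site.
        distance = cleavage_site - (start + 6)
        if min_up <= distance <= max_up:
            hits.append(start)
    return hits

# Hypothetical sequence: hexamer at index 2, cleavage site 15 nt downstream of it.
example = "GG" + "AAUAAA" + "C" * 15 + "GUGUGU"
signals = find_poly_a_signals(example, cleavage_site=23)
```

A hexamer lying closer than 10 or farther than 30 nucleotides from the chosen site is not reported; the window bounds are parameters so that other distance conventions can be tried.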
TRANSLATION 

Correct translation of the genetic code depends upon each amino acid forming a linkage with
the appropriate transfer RNA (tRNA). The aminoacyl-tRNA synthetases (aaRSs) are essential
proteins found in all living organisms. The aaRSs are responsible for the activation and correct
attachment of an amino acid with its cognate tRNA, as the first step in protein biosynthesis.
Prokaryotic organisms have at least twenty different types of aaRSs, one for each different amino
acid, while eukaryotes usually have two aaRSs, a cytosolic form and a mitochondrial form, for each
different amino acid. The 20 aaRS enzymes can be divided into two structural classes. Class I
enzymes add amino acids to the 2' hydroxyl at the 3' end of tRNAs while Class II enzymes add amino
acids to the 3' hydroxyl at the 3' end of tRNAs. Each class is characterized by a distinctive topology
of the catalytic domain. Class I enzymes contain a catalytic domain based on the nucleotide-binding
Rossmann fold. In particular, a consensus tetrapeptide motif is highly conserved (Prosite Document
PDOC00161, Aminoacyl-transfer RNA synthetases class-I signature). Class I enzymes are specific
for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan,
and valine. Class II enzymes contain a central catalytic domain, which consists of a seven-stranded
antiparallel β-sheet domain, as well as N- and C-terminal regulatory domains. Class II enzymes are
separated into two groups based on the heterodimeric or homodimeric structure of the enzyme; the
latter group is further subdivided by the structure of the N- and C-terminal regulatory domains
(Härtlein, M. and Cusack, S. (1995) J. Mol. Evol. 40:519-530). Class II enzymes are specific for
alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine.
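The class assignments listed above can be captured in a small lookup, sketched here in Python. The names `CLASS_I`, `CLASS_II`, and `aars_class` are hypothetical and chosen only for illustration; the amino acid sets and the 2'/3' hydroxyl assignments are taken directly from the preceding paragraph.

```python
# Amino acid specificities as listed in the text (three-letter codes).
CLASS_I = frozenset(
    {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
)
CLASS_II = frozenset(
    {"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"}
)

def aars_class(amino_acid):
    """Return (synthetase class, tRNA 3'-end hydroxyl used for attachment)
    for a three-letter amino acid code, following the text's classification."""
    code = amino_acid.capitalize()
    if code in CLASS_I:
        return ("I", "2'")
    if code in CLASS_II:
        return ("II", "3'")
    raise ValueError(f"unknown amino acid code: {amino_acid!r}")

# Example lookups.
ile_class = aars_class("Ile")
ser_class = aars_class("SER")
```

The two sets are disjoint and together cover the twenty standard amino acids, which gives a quick consistency check on the lists.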

Certain aaRSs also have editing functions. IleRS, for example, can misactivate valine to form
Val-tRNA(Ile), but this product is cleared by a hydrolytic activity that destroys the mischarged product.
This editing activity is located within a second catalytic site found in the connective polypeptide 1
region (CP1), a long insertion sequence within the Rossmann fold domain of Class I enzymes
(Schimmel, P. et al. (1998) FASEB J. 12:1599-1609). AaRSs also play a role in tRNA processing. It
has been shown that mature tRNAs are charged with their respective amino acids in the nucleus
before export to the cytoplasm, and charging may serve as a quality control mechanism to ensure the
tRNAs are functional (Martinis, S.A. et al. (1999) EMBO J. 18:4591-4596).

Under optimal conditions, polypeptide synthesis proceeds at a rate of approximately 40 amino
acid residues per second. The rate of misincorporation during translation is on the order of 10^-4 and is
primarily the result of aminoacyl-tRNAs being charged with the incorrect amino acid. Incorrectly
charged tRNAs are toxic to cells as they result in the incorporation of incorrect amino acid residues
into an elongating polypeptide. The rate of translation is presumed to be a compromise between the
optimal rate of elongation and the need for translational fidelity. Mathematical calculations predict
that 10^-4 is indeed the maximum acceptable error rate for protein synthesis in a biological system
(reviewed in Stryer, L. supra and Watson, J. et al. (1987) The Benjamin/Cummings Publishing Co.,
Inc. Menlo Park, CA). A particularly error prone aminoacyl-tRNA charging event is the charging of
tRNA(Gln) with Gln. A mechanism exists for the correction of this mischarging event which likely has its
origins in evolution. Gln was among the last of the 20 naturally occurring amino acids used in
polypeptide synthesis to appear in nature. Gram positive eubacteria, cyanobacteria, Archaea, and
eukaryotic organelles possess a noncanonical pathway for the synthesis of Gln-tRNA(Gln) based on the
transformation of Glu-tRNA(Gln) (synthesized by Glu-tRNA synthetase, GluRS) using the enzyme
Glu-tRNA(Gln) amidotransferase (Glu-AdT). The reactions involved in the transamidation pathway are as
follows (Curnow, A.W. et al. (1997) Nucleic Acids Symposium 36:2-4):

    GluRS:    tRNA(Gln) + Glu + ATP → Glu-tRNA(Gln) + AMP + PPi
    Glu-AdT:  Glu-tRNA(Gln) + Gln + ATP → Gln-tRNA(Gln) + Glu + ADP + Pi

A similar enzyme, Asp-tRNA(Asn) amidotransferase, exists in Archaea, which transforms
Asp-tRNA(Asn) to Asn-tRNA(Asn). Formylase, the enzyme that transforms Met-tRNA(fMet) to fMet-tRNA(fMet) in
eubacteria, is likely to be a related enzyme. A hydrolytic activity has also been identified that destroys
mischarged Val-tRNA(Ile) (Schimmel, P. et al. (1998) FASEB J. 12:1599-1609). One likely scenario
for the evolution of Glu-AdT in primitive life forms is the absence of a specific glutaminyl-tRNA
synthetase (GlnRS), requiring an alternative pathway for the synthesis of Gln-tRNA(Gln). In fact,
deletion of the Glu-AdT operon in Gram positive bacteria is lethal (Curnow, A.W. et al. (1997) Proc.
Natl. Acad. Sci. USA 94:11819-11826). The existence of GluRS activity in other organisms has
been inferred from the high degree of conservation in translation machinery in nature; however, GluRS
has not been identified in all organisms, including Homo sapiens. Such an enzyme would be
responsible for ensuring translational fidelity and reducing the synthesis of defective polypeptides.

In addition to their function in protein synthesis, specific aminoacyl tRNA synthetases also
play roles in cellular fidelity, RNA splicing, RNA trafficking, apoptosis, and transcriptional and
translational regulation. For example, human tyrosyl-tRNA synthetase can be proteolytically cleaved
into two fragments with distinct cytokine activities. The carboxy-terminal domain exhibits monocyte
and leukocyte chemotaxis activity as well as stimulating production of myeloperoxidase, tumor
necrosis factor-α, and tissue factor. The N-terminal domain binds to the interleukin-8 type A receptor
and functions as an interleukin-8-like cytokine. Human tyrosyl-tRNA synthetase is secreted from
apoptotic tumor cells and may accelerate apoptosis (Wakasugi, K., and Schimmel, P. (1999) Science
284:147-151). Mitochondrial Neurospora crassa TyrRS and S. cerevisiae LeuRS are essential factors
for certain group I intron splicing activities, and human mitochondrial LeuRS can substitute for the
yeast LeuRS in a yeast null strain. Certain bacterial aaRSs are involved in regulating their own
transcription or translation (Martinis, supra). Several aaRSs are able to synthesize diadenosine
oligophosphates, a class of signalling molecules with roles in cell proliferation, differentiation, and
apoptosis (Kisselev, L.L. et al. (1998) FEBS Lett. 427:157-163; Vartanian, A. et al. (1999) FEBS Lett.
456:175-180).

Autoantibodies against aminoacyl-tRNAs are generated by patients with autoimmune diseases
such as rheumatic arthritis, dermatomyositis and polymyositis, and correlate strongly with complicating
interstitial lung disease (ILD) (Freist, W. et al. (1999) Biol. Chem. 380:623-646; Freist, W. et al.
(1996) Biol. Chem. Hoppe Seyler 377:343-356). These antibodies appear to be generated in response
to viral infection, and coxsackie virus has been used to induce experimental viral myositis in animals.

Comparison of aaRS structures between humans and pathogens has been useful in the design
of novel antibiotics (Schimmel, supra). Genetically engineered aaRSs have been utilized to allow
site-specific incorporation of unnatural amino acids into proteins in vivo (Liu, D.R. et al. (1997) Proc. Natl.
Acad. Sci. USA 94:10092-10097).
tRNA Modifications

The modified ribonucleoside, pseudouridine (ψ), is present ubiquitously in the anticodon regions
of transfer RNAs (tRNAs), large and small ribosomal RNAs (rRNAs), and small nuclear RNAs
(snRNAs). ψ is the most common of the modified nucleosides (i.e., other than G, A, U, and C) present
in tRNAs. Only a few yeast tRNAs that are not involved in protein synthesis do not contain ψ
(Cortese, R. et al. (1974) J. Biol. Chem. 249:1103-1108). The enzyme responsible for the conversion
of uridine to ψ, pseudouridine synthase (pseudouridylate synthase), was first isolated from Salmonella
typhimurium (Arena, F. et al. (1978) Nuc. Acids Res. 5:4523-4536). The enzyme has since been
isolated from a number of mammals, including steer and mice (Green, C.J. et al. (1982) J. Biol. Chem.
257:3045-52 and Chen, J. and Patton, J.R. (1999) RNA 5:409-419). tRNA pseudouridine synthases
have been the most extensively studied members of the family. They require a thiol donor (e.g.,
cysteine) and a monovalent cation (e.g., ammonia or potassium) for optimal activity. Additional
cofactors or high energy molecules (e.g., ATP or GTP) are not required (Green, supra). Other
eukaryotic pseudouridine synthases have been identified that appear to be specific for rRNA (reviewed
in Smith, C.M. and Steitz, J.A. (1997) Cell 89:669-672) and a dual-specificity enzyme has been
identified that uses both tRNA and rRNA substrates (Wrzesinski, J. et al. (1995) RNA 1:437-448).
The absence of ψ in the anticodon loop of tRNAs results in reduced growth in both bacteria (Singer,
C.B. et al. (1972) Nature New Biol. 238:72-74) and yeast (Lecointe, F. (1998) 273:1316-1323),
although the genetic defect is not lethal.

Another ribonucleoside modification that occurs primarily in eukaryotic cells is the conversion
of guanosine to N2,N2-dimethylguanosine (m2,2G) at position 26 or 10 at the base of the D-stem of
cytosolic and mitochondrial tRNAs. This posttranscriptional modification is believed to stabilize tRNA
structure by preventing the formation of alternative tRNA secondary and tertiary structures. Yeast
tRNA(Asp) is unusual in that it does not contain this modification. The modification does not occur in
eubacteria, presumably because the structure of tRNAs in these cells and organelles is sequence
constrained and does not require posttranscriptional modification to prevent the formation of

WO 02/078420    PCT/US02/09809

alternative structures (Steinberg, S. and Cedergren, R. (1995) RNA 1:886-891, and references within).
The enzyme responsible for the conversion of guanosine to m2,2G is a 63 kDa S-adenosylmethionine
(SAM)-dependent tRNA N2,N2-dimethylguanosine methyltransferase (also referred to as the TRM1
gene product and herein referred to as TRM) (Edqvist, J. (1995) Biochimie 77:54-61). The enzyme
localizes to both the nucleus and the mitochondria (Li, J-M. et al. (1989) J. Cell Biol. 109:1411-1419).
Based on studies with TRM from Xenopus laevis, there appears to be a requirement for base pairing
at positions C11-G24 and G10-C25 immediately preceding the G26 to be modified, with other
structural features of the tRNA also being required for the proper presentation of the G26 substrate
(Edqvist, J. et al. (1992) Nuc. Acids Res. 20:6575-81). Studies in yeast suggest that cells carrying a
weak ochre tRNA suppressor (sup3-i) are unable to suppress translation termination in the absence of
TRM activity, suggesting a role for TRM in modifying the frequency of suppression in eukaryotic cells
(Niederberger, C. et al. (1999) FEBS Lett. 464:67-70), in addition to the more general function of
ensuring the proper three-dimensional structures for tRNA.

Translation Initiation

Initiation of translation can be divided into three stages. The first stage brings an initiator
transfer RNA (Met-tRNAf) together with the 40S ribosomal subunit to form the 43S preinitiation
complex. The second stage binds the 43S preinitiation complex to the mRNA, followed by migration
of the complex to the correct AUG initiation codon. The third stage brings the 60S ribosomal subunit
to the 40S subunit to generate an 80S ribosome at the initiation codon. Regulation of translation
primarily involves the first and second stages in the initiation process (V.M. Pain (1996) Eur. J.
Biochem. 236:747-771).

Several initiation factors, many of which contain multiple subunits, are involved in bringing an
initiator tRNA and the 40S ribosomal subunit together. eIF2, a guanine nucleotide binding protein,
recruits the initiator tRNA to the 40S ribosomal subunit. Only when eIF2 is bound to GTP does it
associate with the initiator tRNA. eIF2B, a guanine nucleotide exchange protein, is responsible for
converting eIF2 from the GDP-bound inactive form to the GTP-bound active form. Two other
factors, eIF1A and eIF3, bind and stabilize the 40S subunit by interacting with the 18S ribosomal RNA
and specific ribosomal structural proteins. eIF3 is also involved in association of the 40S ribosomal
subunit with mRNA. The Met-tRNAf, eIF1A, eIF3, and 40S ribosomal subunit together make up the
43S preinitiation complex (Pain, supra).

Additional factors are required for binding of the 43S preinitiation complex to an mRNA
molecule, and the process is regulated at several levels. eIF4F is a complex consisting of three
proteins: eIF4E, eIF4A, and eIF4G. eIF4E recognizes and binds to the mRNA 5'-terminal m7GTP
cap, eIF4A is a bidirectional RNA-dependent helicase, and eIF4G is a scaffolding polypeptide. eIF4G
has three binding domains. The N-terminal third of eIF4G interacts with eIF4E, the central third
interacts with eIF4A, and the C-terminal third interacts with eIF3 bound to the 43S preinitiation
complex. Thus, eIF4G acts as a bridge between the 40S ribosomal subunit and the mRNA (M.W.
Hentze (1997) Science 275:500-501).

The ability of eIF4F to initiate binding of the 43S preinitiation complex is regulated by
structural features of the mRNA. The mRNA molecule has an untranslated region (UTR) between
the 5' cap and the AUG start codon. In some mRNAs this region forms secondary structures that
impede binding of the 43S preinitiation complex. The helicase activity of eIF4A is thought to function
in removing this secondary structure to facilitate binding of the 43S preinitiation complex (Pain, supra).
Translation Elongation 

Elongation is the process whereby additional amino acids are joined to the initiator methionine
to form the complete polypeptide chain. The elongation factors EF1α, EF1βγ, and EF2 are involved
in elongating the polypeptide chain following initiation. EF1α is a GTP-binding protein. In EF1α's
GTP-bound form, it brings an aminoacyl-tRNA to the ribosome's A site. The amino acid attached to
the newly arrived aminoacyl-tRNA forms a peptide bond with the initiator methionine. The GTP on
EF1α is hydrolyzed to GDP, and EF1α-GDP dissociates from the ribosome. EF1βγ binds EF1α-GDP
and induces the dissociation of GDP from EF1α, allowing EF1α to bind GTP and a new cycle
to begin.

As subsequent aminoacyl-tRNAs are brought to the ribosome, EF-G, another GTP-binding
protein, catalyzes the translocation of tRNAs from the A site to the P site and finally to the E site of
the ribosome. This allows the ribosome and the mRNA to remain attached during translation.
Translation Termination

The release factor eRF carries out termination of translation. eRF recognizes stop codons in
the mRNA, leading to the release of the polypeptide chain from the ribosome.
Expression Profiling

Array technology can provide a simple way to explore the expression of a single polymorphic
gene or the expression profile of a large number of related or unrelated genes. When the expression
of a single gene is examined, arrays are employed to detect the expression of a specific gene or its
variants. When an expression profile is examined, arrays provide a platform for identifying genes that
are tissue specific, are affected by a substance being tested in a toxicology assay, are part of a
signaling cascade, carry out housekeeping functions, or are specifically related to a particular genetic
predisposition, condition, disease, or disorder.
Expression

Tumor necrosis factor α (TNF-α) is a pleiotropic cytokine that mediates immune regulation and
inflammatory responses. TNF-α-related cytokines generate partially overlapping cellular responses,
including differentiation, proliferation, nuclear factor-κB (NF-κB) activation, and cell death, by
triggering the aggregation of receptor monomers (Smith, C.A. et al. (1994) Cell 76:959-962). The
cellular responses triggered by TNF-α are initiated through its interaction with distinct cell surface
receptors (TNFRs). NF-κB is a transcription factor with a pivotal role in inducing genes involved in
physiological processes as well as in the response to injury and infection. Activation of NF-κB
involves the phosphorylation and subsequent degradation of an inhibitory protein, IκB, and many of the
proximal kinases and adaptor molecules involved in this process have been elucidated. Additionally,
the NF-κB activation pathway from cell membrane to nucleus for IL-1 and TNF-α is now understood
(Bowie, A. and L.A. O'Neill (2000) Biochem. Pharmacol. 59:13-23).

Treatment of confluent cultures of vascular smooth muscle cells (SMCs) with TNF-α
suppresses the incorporation of [3H]proline into both collagenase-digestible proteins (CDP) and
noncollagenous proteins (NCP). Such suppression by TNF-α is not observed in confluent bovine
aortic endothelial cells and human fibroblastic IMR-90 cells. TNF-α decreases the relative proportion
of collagen types IV and V, suggesting that TNF-α modulates collagen synthesis by SMCs depending
on their cell density and therefore may modify formation of atherosclerotic lesions (Hiraga, S. et al.
(2000) Life Sci. 66:235-244).

Human aortic endothelial cells (HAECs) are primary cells derived from the endothelium of a
human aorta. Human iliac artery endothelial cells (HIAECs) are primary cells derived from the
endothelium of an iliac artery. Human umbilical vein endothelial cells (HUVECs) are primary cells
derived from the endothelium of an umbilical vein. Primary human endothelial cell lines have been
used as an experimental model for investigating in vitro the role of the endothelium in human vascular
biology. Activation of the vascular endothelium is considered to be a central event in a wide range of
both physiological and pathophysiological processes, such as vascular tone regulation, coagulation and
thrombosis, atherosclerosis, and inflammation.

Thus, vascular tissue genes differentially expressed during treatment of HAEC, HIAEC, and
HUVEC cell cultures with TNF-α may reasonably be expected to be markers of the atherosclerotic
process.
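
As an illustration only, the idea of selecting genes that respond to TNF-α in all three endothelial cell cultures can be sketched as follows. The gene names, signal values, pseudocount, and two-fold threshold are hypothetical assumptions chosen for the example and are not taken from this specification.

```python
import math

def log2_fold_change(treated, untreated, pseudocount=1.0):
    """log2 ratio of treated to untreated signal; the pseudocount avoids division by zero."""
    return math.log2((treated + pseudocount) / (untreated + pseudocount))

def shared_responders(cultures, threshold=1.0):
    """Return genes whose |log2 fold change| meets the threshold in every culture.

    cultures maps a culture name to {gene: (untreated_signal, treated_signal)}.
    """
    responder_sets = []
    for measurements in cultures.values():
        responder_sets.append({
            gene
            for gene, (untreated, treated) in measurements.items()
            if abs(log2_fold_change(treated, untreated)) >= threshold
        })
    if not responder_sets:
        return []
    return sorted(set.intersection(*responder_sets))

# Hypothetical array signals: (untreated, TNF-alpha treated)
cultures = {
    "HAEC":  {"geneA": (99.0, 399.0), "geneB": (49.0, 51.0),  "geneC": (199.0, 49.0)},
    "HIAEC": {"geneA": (99.0, 299.0), "geneB": (49.0, 199.0), "geneC": (199.0, 49.0)},
    "HUVEC": {"geneA": (99.0, 399.0), "geneB": (49.0, 49.0),  "geneC": (199.0, 24.0)},
}
candidates = shared_responders(cultures)
```

With these made-up values, geneA (induced) and geneC (repressed) change at least two-fold in every culture, whereas geneB does not.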

The discovery of new molecules for disease detection and treatment, and the polynucleotides
encoding them, satisfies a need in the art by providing new compositions which are useful in the
diagnosis, prevention, and treatment of cell proliferative, autoimmune/inflammatory, developmental,
and neurological disorders, and in the assessment of the effects of exogenous compounds on the
expression of nucleic acid and amino acid sequences of molecules for disease detection and treatment.

SUMMARY OF THE INVENTION 

The invention features purified polypeptides, molecules for disease detection and treatment,
referred to collectively as "MDDT" and individually as "MDDT-1," "MDDT-2," "MDDT-3,"
"MDDT-4," "MDDT-5," "MDDT-6," "MDDT-7," "MDDT-8," "MDDT-9," "MDDT-10," "MDDT-11,"
"MDDT-12," "MDDT-13," "MDDT-14," "MDDT-15," "MDDT-16," "MDDT-17," "MDDT-18,"
"MDDT-19," "MDDT-20," "MDDT-21," "MDDT-22," and "MDDT-23." In one aspect, the
invention provides an isolated polypeptide selected from the group consisting of a) a polypeptide
comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, b) a
polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino
acid sequence selected from the group consisting of SEQ ID NO:1-23, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-23, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-23. In one alternative, the invention provides an
isolated polypeptide comprising the amino acid sequence of SEQ ID NO:1-23.
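
The "at least 90% identical" criterion recited above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it counts identity over all aligned columns; this is a simplification for illustration and is not the alignment program or scoring scheme relied on by the specification. The sequences shown are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns (ignoring gap-gap columns) with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # a column that is a gap in both sequences carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared

def meets_identity_threshold(aligned_a, aligned_b, threshold=90.0):
    """True when the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold

# Hypothetical ten-residue sequences differing at one position
reference = "MKTAYIAKQR"
variant = "MKTAYIAKQL"
identity = percent_identity(reference, variant)
```

Here nine of ten positions match, so the pair sits exactly at the 90% boundary and satisfies an "at least 90%" test.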

The invention further provides an isolated polynucleotide encoding a polypeptide selected from
the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group
consisting of SEQ ID NO:1-23, b) a polypeptide comprising a naturally occurring amino acid sequence
at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:1-23, c) a biologically active fragment of a polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NO:1-23, and d) an immunogenic fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23. In one
alternative, the polynucleotide encodes a polypeptide selected from the group consisting of SEQ ID
NO:1-23. In another alternative, the polynucleotide is selected from the group consisting of SEQ ID
NO:24-46.

Additionally, the invention provides a recombinant polynucleotide comprising a promoter
sequence operably linked to a polynucleotide encoding a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-23, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, c) a
biologically active fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-23, and d) an immunogenic fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-23. In one alternative, the
invention provides a cell transformed with the recombinant polynucleotide. In another alternative, the
invention provides a transgenic organism comprising the recombinant polynucleotide.

The invention also provides a method for producing a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-23, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, c) a
biologically active fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-23, and d) an immunogenic fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-23. The method comprises a)
culturing a cell under conditions suitable for expression of the polypeptide, wherein said cell is
transformed with a recombinant polynucleotide comprising a promoter sequence operably linked to a
polynucleotide encoding the polypeptide, and b) recovering the polypeptide so expressed.

Additionally, the invention provides an isolated antibody which specifically binds to a
polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence
selected from the group consisting of SEQ ID NO:1-23, b) a polypeptide comprising a naturally
occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the
group consisting of SEQ ID NO:1-23, c) a biologically active fragment of a polypeptide having an
amino acid sequence selected from the group consisting of SEQ ID NO:1-23, and d) an immunogenic
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-23.

The invention further provides an isolated polynucleotide selected from the group consisting of
a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ
ID NO:24-46, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46,
c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to
the polynucleotide of b), and e) an RNA equivalent of a)-d). In one alternative, the polynucleotide
comprises at least 60 contiguous nucleotides.

Additionally, the invention provides a method for detecting a target polynucleotide in a sample,
said target polynucleotide having a sequence of a polynucleotide selected from the group consisting of
a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ
ID NO:24-46, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46,
c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to
the polynucleotide of b), and e) an RNA equivalent of a)-d). The method comprises a) hybridizing the
sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence
complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to
said target polynucleotide, under conditions whereby a hybridization complex is formed between said
probe and said target polynucleotide or fragments thereof, and b) detecting the presence or absence of
said hybridization complex, and optionally, if present, the amount thereof. In one alternative, the probe
comprises at least 60 contiguous nucleotides.
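
A minimal in silico sketch of the probe requirement described above: the probe must contain at least 20 contiguous nucleotides and be complementary to the target. The sketch assumes DNA sequences written 5' to 3' and requires a perfect match, whereas real hybridization tolerates some mismatch depending on stringency. The function names and the target sequence are hypothetical.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def find_probe_site(probe: str, target: str, min_length: int = 20) -> Optional[int]:
    """Return the 0-based target position the probe would anneal to, or None.

    The probe is rejected if it is shorter than min_length or if its reverse
    complement does not occur exactly in the target.
    """
    if len(probe) < min_length:
        return None
    index = target.upper().find(reverse_complement(probe))
    return index if index >= 0 else None

# Hypothetical 34-nucleotide target and a 22-nucleotide probe designed against it
target = "GATTACAGGCTTAACCGGTTACGTAGCTAGGCTA"
probe = reverse_complement(target[5:27])
site = find_probe_site(probe, target)
```

Raising `min_length` to 60 would model the alternative in which the probe comprises at least 60 contiguous nucleotides.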

The invention further provides a method for detecting a target polynucleotide in a sample, said
target polynucleotide having a sequence of a polynucleotide selected from the group consisting of a) a
polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID
NO:24-46, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90%
identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46, c) a
polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the
polynucleotide of b), and e) an RNA equivalent of a)-d). The method comprises a) amplifying said
target polynucleotide or fragment thereof using polymerase chain reaction amplification, and b)
detecting the presence or absence of said amplified target polynucleotide or fragment thereof, and,
optionally, if present, the amount thereof.

The invention further provides a composition comprising an effective amount of a polypeptide
selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected
from the group consisting of SEQ ID NO:1-23, b) a polypeptide comprising a naturally occurring
amino acid sequence at least 90% identical to an amino acid sequence selected from the group
consisting of SEQ ID NO:1-23, c) a biologically active fragment of a polypeptide having an amino acid
sequence selected from the group consisting of SEQ ID NO:1-23, and d) an immunogenic fragment of
a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23,
and a pharmaceutically acceptable excipient. In one embodiment, the composition comprises an amino
acid sequence selected from the group consisting of SEQ ID NO:1-23. The invention additionally
provides a method of treating a disease or condition associated with decreased expression of
functional MDDT, comprising administering to a patient in need of such treatment the composition.

The invention also provides a method for screening a compound for effectiveness as an
agonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-23, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO:1-23, c) a biologically active fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-23. The method comprises a) exposing a sample comprising the
polypeptide to a compound, and b) detecting agonist activity in the sample. In one alternative, the
invention provides a composition comprising an agonist compound identified by the method and a
pharmaceutically acceptable excipient. In another alternative, the invention provides a method of
treating a disease or condition associated with decreased expression of functional MDDT, comprising
administering to a patient in need of such treatment the composition.

Additionally, the invention provides a method for screening a compound for effectiveness as
an antagonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an
amino acid sequence selected from the group consisting of SEQ ID NO:1-23, b) a polypeptide
comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid
sequence selected from the group consisting of SEQ ID NO:1-23, c) a biologically active fragment of
a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23,
and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-23. The method comprises a) exposing a sample comprising the
polypeptide to a compound, and b) detecting antagonist activity in the sample. In one alternative, the
invention provides a composition comprising an antagonist compound identified by the method and a
pharmaceutically acceptable excipient. In another alternative, the invention provides a method of
treating a disease or condition associated with overexpression of functional MDDT, comprising
administering to a patient in need of such treatment the composition.

The invention further provides a method of screening for a compound that specifically binds to
a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-23, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO:1-23, c) a biologically active fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-23. The method comprises a) combining the polypeptide with at least one
test compound under suitable conditions, and b) detecting binding of the polypeptide to the test
compound, thereby identifying a compound that specifically binds to the polypeptide.

The invention further provides a method of screening for a compound that modulates the
activity of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-23, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO:1-23, c) a biologically active fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-23. The method comprises a) combining the polypeptide with at least one
test compound under conditions permissive for the activity of the polypeptide, b) assessing the activity
of the polypeptide in the presence of the test compound, and c) comparing the activity of the
polypeptide in the presence of the test compound with the activity of the polypeptide in the absence of
the test compound, wherein a change in the activity of the polypeptide in the presence of the test
compound is indicative of a compound that modulates the activity of the polypeptide.
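
The comparison in step c) can be sketched as a simple classification of the activity measured with and without the test compound. The 20% cutoff, the activity units, and the labels are assumptions for illustration; the specification itself does not fix a numerical threshold here.

```python
def classify_modulation(activity_with, activity_without, min_relative_change=0.2):
    """Compare polypeptide activity with and without a test compound.

    Returns a label describing the direction of change, using a hypothetical
    relative-change cutoff to separate a real effect from assay noise.
    """
    if activity_without <= 0:
        raise ValueError("baseline activity must be positive")
    relative_change = (activity_with - activity_without) / activity_without
    if relative_change >= min_relative_change:
        return "increases activity"
    if relative_change <= -min_relative_change:
        return "decreases activity"
    return "no change detected"

# Hypothetical activity readings (arbitrary units), baseline of 100.0
results = {
    "compound_1": classify_modulation(150.0, 100.0),
    "compound_2": classify_modulation(40.0, 100.0),
    "compound_3": classify_modulation(95.0, 100.0),
}
```

In practice the cutoff would be set from replicate measurements of the untreated control rather than chosen in advance.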

The invention further provides a method for screening a compound for effectiveness in
altering expression of a target polynucleotide, wherein said target polynucleotide comprises a
polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46, the method
comprising a) exposing a sample comprising the target polynucleotide to a compound, b) detecting
altered expression of the target polynucleotide, and c) comparing the expression of the target
polynucleotide in the presence of varying amounts of the compound and in the absence of the
compound.
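
Step c) compares expression across varying amounts of the compound against a no-compound control. A minimal sketch of that comparison, assuming one expression signal per dose and a hypothetical monotonic-trend check, might look like this; the doses and signals are invented for the example.

```python
def relative_expression(dose_to_signal, control_signal):
    """Express each dose's signal as a fraction of the no-compound control."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return {
        dose: signal / control_signal
        for dose, signal in sorted(dose_to_signal.items())
    }

def is_dose_dependent(relative):
    """True when expression rises or falls monotonically as the dose increases."""
    values = [relative[dose] for dose in sorted(relative)]
    if len(values) < 2:
        return False
    increasing = all(a <= b for a, b in zip(values, values[1:]))
    decreasing = all(a >= b for a, b in zip(values, values[1:]))
    return (increasing or decreasing) and values[0] != values[-1]

# Hypothetical signals at three compound concentrations; control measured without compound
control = 200.0
signals = {1.0: 180.0, 10.0: 120.0, 100.0: 50.0}
relative = relative_expression(signals, control)
trend = is_dose_dependent(relative)
```

With these values, expression falls steadily as the amount of compound increases, which is the pattern expected of a compound that reduces expression of the target polynucleotide.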

The invention further provides a method for assessing toxicity of a test compound, said
method comprising a) treating a biological sample containing nucleic acids with the test compound; b)
hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20
contiguous nucleotides of a polynucleotide selected from the group consisting of i) a polynucleotide
comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46, ii) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a
polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46, iii) a polynucleotide
having a sequence complementary to i), iv) a polynucleotide complementary to the polynucleotide of
ii), and v) an RNA equivalent of i)-iv). Hybridization occurs under conditions whereby a specific
hybridization complex is formed between said probe and a target polynucleotide in the biological
sample, said target polynucleotide selected from the group consisting of i) a polynucleotide comprising
a polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46, ii) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a
polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46, iii) a polynucleotide
complementary to the polynucleotide of i), iv) a polynucleotide complementary to the polynucleotide of
ii), and v) an RNA equivalent of i)-iv). Alternatively, the target polynucleotide comprises a fragment
of a polynucleotide sequence selected from the group consisting of i)-v) above; c) quantifying the
amount of hybridization complex; and d) comparing the amount of hybridization complex in the treated
biological sample with the amount of hybridization complex in an untreated biological sample, wherein
a difference in the amount of hybridization complex in the treated biological sample is indicative of
toxicity of the test compound.

BRIEF DESCRIPTION OF THE TABLES 

Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide
sequences of the present invention.

Table 2 shows the GenBank identification number and annotation of the nearest GenBank
homolog, and the PROTEOME database identification numbers and annotations of PROTEOME
database homologs, for polypeptides of the invention. The probability scores for the matches between
each polypeptide and its homolog(s) are also shown.

Table 3 shows structural features of polypeptide sequences of the invention, including
predicted motifs and domains, along with the methods, algorithms, and searchable databases used for
analysis of the polypeptides.

Table 4 lists the cDNA and/or genomic DNA fragments which were used to assemble
polynucleotide sequences of the invention, along with selected fragments of the polynucleotide
sequences.

Table 5 shows the representative cDNA library for polynucleotides of the invention.

Table 6 provides an appendix which describes the tissues and vectors used for construction of
the cDNA libraries shown in Table 5.

Table 7 shows the tools, programs, and algorithms used to analyze the polynucleotides and
polypeptides of the invention, along with applicable descriptions, references, and threshold parameters.

Table 8 shows single nucleotide polymorphisms found in polynucleotide sequences of the
invention, along with allele frequencies in different human populations.

DESCRIPTION OF THE INVENTION

Before the present proteins, nucleotide sequences, and methods are described, it is understood
that this invention is not limited to the particular machines, materials and methods described, as these
may vary. It is also to be understood that the terminology used herein is for the purpose of describing
particular embodiments only, and is not intended to limit the scope of the present invention which will
be limited only by the appended claims.

It must be noted that as used herein and in the appended claims, the singular forms "a," "an,"
and "the" include plural reference unless the context clearly dictates otherwise. Thus, for example, a
reference to "a host cell" includes a plurality of such host cells, and a reference to "an antibody" is a
reference to one or more antibodies and equivalents thereof known to those skilled in the art, and so
forth.

Unless defined otherwise, all technical and scientific terms used herein have the same
meanings as commonly understood by one of ordinary skill in the art to which this invention belongs.
Although any machines, materials, and methods similar or equivalent to those described herein can be
used to practice or test the present invention, the preferred machines, materials and methods are now
described. All publications mentioned herein are cited for the purpose of describing and disclosing the
cell lines, protocols, reagents and vectors which are reported in the publications and which might be
used in connection with the invention. Nothing herein is to be construed as an admission that the
invention is not entitled to antedate such disclosure by virtue of prior invention.
DEFINITIONS

"MDDT" refers to the amino acid sequences of substantially purified MDDT obtained from
any species, particularly a mammalian species, including bovine, ovine, porcine, murine, equine, and
human, and from any source, whether natural, synthetic, semi-synthetic, or recombinant.

The term "agonist" refers to a molecule which intensifies or mimics the biological activity of
MDDT. Agonists may include proteins, nucleic acids, carbohydrates, small molecules, or any other
compound or composition which modulates the activity of MDDT either by directly interacting with
MDDT or by acting on components of the biological pathway in which MDDT participates.

An "allelic variant" is an alternative form of the gene encoding MDDT. Allelic variants may
result from at least one mutation in the nucleic acid sequence and may result in altered mRNAs or in
polypeptides whose structure or function may or may not be altered. A gene may have none, one, or
many allelic variants of its naturally occurring form. Common mutational changes which give rise to
allelic variants are generally ascribed to natural deletions, additions, or substitutions of nucleotides.
Each of these types of changes may occur alone, or in combination with the others, one or more times
in a given sequence.

"Altered" nucleic acid sequences encoding MDDT include those sequences with deletions,
insertions, or substitutions of different nucleotides, resulting in a polypeptide the same as MDDT or a
polypeptide with at least one functional characteristic of MDDT. Included within this definition are
polymorphisms which may or may not be readily detectable using a particular oligonucleotide probe of
the polynucleotide encoding MDDT, and improper or unexpected hybridization to allelic variants, with
a locus other than the normal chromosomal locus for the polynucleotide sequence encoding MDDT.
The encoded protein may also be "altered," and may contain deletions, insertions, or substitutions of
amino acid residues which produce a silent change and result in a functionally equivalent MDDT.
Deliberate amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility,
hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues, as long as the biological
or immunological activity of MDDT is retained. For example, negatively charged amino acids may
include aspartic acid and glutamic acid, and positively charged amino acids may include lysine and
arginine. Amino acids with uncharged polar side chains having similar hydrophilicity values may
include: asparagine and glutamine; and serine and threonine. Amino acids with uncharged side chains
having similar hydrophilicity values may include: leucine, isoleucine, and valine; glycine and alanine;
and phenylalanine and tyrosine.
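
The residue groupings listed above can be encoded directly to flag which substitutions between two equal-length sequences fall within a group. This is a sketch under stated assumptions: only the groups named in the preceding paragraph are encoded (in one-letter code), the sequences are hypothetical, and a substitution within a group is not by itself evidence that biological or immunological activity is retained.

```python
# Groups taken from the preceding paragraph, written in one-letter amino acid code.
CONSERVATIVE_GROUPS = [
    frozenset("DE"),   # negatively charged: aspartic acid, glutamic acid
    frozenset("KR"),   # positively charged: lysine, arginine
    frozenset("NQ"),   # uncharged polar: asparagine, glutamine
    frozenset("ST"),   # uncharged polar: serine, threonine
    frozenset("LIV"),  # leucine, isoleucine, valine
    frozenset("GA"),   # glycine, alanine
    frozenset("FY"),   # phenylalanine, tyrosine
]

def is_conservative(a, b):
    """True if residues a and b are identical or share one of the listed groups."""
    a, b = a.upper(), b.upper()
    if a == b:
        return True
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)

def classify_substitutions(reference, variant):
    """List (position, reference residue, variant residue, conservative?) for each difference."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (i + 1, r, v, is_conservative(r, v))
        for i, (r, v) in enumerate(zip(reference.upper(), variant.upper()))
        if r != v
    ]

# Hypothetical five-residue example
changes = classify_substitutions("MDLKS", "MEIKW")
```

In this example the D-to-E and L-to-I changes fall within a listed group, while the S-to-W change does not.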

The terms "amino acid" and "amino acid sequence" refer to an oligopeptide, peptide,
polypeptide, or protein sequence, or a fragment of any of these, and to naturally occurring or synthetic
molecules. Where "amino acid sequence" is recited to refer to a sequence of a naturally occurring
protein molecule, "amino acid sequence" and like terms are not meant to limit the amino acid sequence
to the complete native amino acid sequence associated with the recited protein molecule.

"Amplification" relates to the production of additional copies of a nucleic acid sequence.
Amplification is generally carried out using polymerase chain reaction (PCR) technologies well known
in the art.

The term "antagonist" refers to a molecule which inhibits or attenuates the biological activity
of MDDT. Antagonists may include proteins such as antibodies, nucleic acids, carbohydrates, small
molecules, or any other compound or composition which modulates the activity of MDDT either by
directly interacting with MDDT or by acting on components of the biological pathway in which
MDDT participates.

The term "antibody" refers to intact immunoglobulin molecules as well as to fragments
thereof, such as Fab, F(ab')2, and Fv fragments, which are capable of binding an epitopic determinant.
Antibodies that bind MDDT polypeptides can be prepared using intact polypeptides or using fragments
containing small peptides of interest as the immunizing antigen. The polypeptide or oligopeptide used
to immunize an animal (e.g., a mouse, a rat, or a rabbit) can be derived from the translation of RNA,
or synthesized chemically, and can be conjugated to a carrier protein if desired. Commonly used
carriers that are chemically coupled to peptides include bovine serum albumin, thyroglobulin, and
keyhole limpet hemocyanin (KLH). The coupled peptide is then used to immunize the animal.

The term "antigenic determinant" refers to that region of a molecule (i.e., an epitope) that
makes contact with a particular antibody. When a protein or a fragment of a protein is used to
immunize a host animal, numerous regions of the protein may induce the production of antibodies
which bind specifically to antigenic determinants (particular regions or three-dimensional structures on
the protein). An antigenic determinant may compete with the intact antigen (i.e., the immunogen used
to elicit the immune response) for binding to an antibody.

The term "aptamer" refers to a nucleic acid or oligonucleotide molecule that binds to a
specific molecular target. Aptamers are derived from an in vitro evolutionary process (e.g., SELEX
(Systematic Evolution of Ligands by Exponential Enrichment), described in U.S. Patent No.
5,270,163), which selects for target-specific aptamer sequences from large combinatorial libraries.
Aptamer compositions may be double-stranded or single-stranded, and may include
deoxyribonucleotides, ribonucleotides, nucleotide derivatives, or other nucleotide-like molecules. The
nucleotide components of an aptamer may have modified sugar groups (e.g., the 2'-OH group of a
ribonucleotide may be replaced by 2'-F or 2'-NH2), which may improve a desired property, e.g.,
resistance to nucleases or longer lifetime in blood. Aptamers may be conjugated to other molecules,
e.g., a high molecular weight carrier to slow clearance of the aptamer from the circulatory system.
Aptamers may be specifically cross-linked to their cognate ligands, e.g., by photo-activation of a
cross-linker. (See, e.g., Brody, E.N. and L. Gold (2000) J. Biotechnol. 74:5-13.)

The term "intramer" refers to an aptamer which is expressed in vivo. For example, a vaccinia
virus-based RNA expression system has been used to express specific RNA aptamers at high levels
in the cytoplasm of leukocytes (Blind, M. et al. (1999) Proc. Natl. Acad. Sci. USA 96:3606-3610).

The term "spiegelmer" refers to an aptamer which includes L-DNA, L-RNA, or other
left-handed nucleotide derivatives or nucleotide-like molecules. Aptamers containing left-handed
nucleotides are resistant to degradation by naturally occurring enzymes, which normally act on
substrates containing right-handed nucleotides.

The term "antisense" refers to any composition capable of base-pairing with the "sense"
(coding) strand of a specific nucleic acid sequence. Antisense compositions may include DNA; RNA;
peptide nucleic acid (PNA); oligonucleotides having modified backbone linkages such as
phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides having modified
sugar groups such as 2'-methoxyethyl sugars or 2'-methoxyethoxy sugars; or oligonucleotides having
modified bases such as 5-methyl cytosine, 2'-deoxyuracil, or 7-deaza-2'-deoxyguanosine. Antisense
molecules may be produced by any method including chemical synthesis or transcription. Once
introduced into a cell, the complementary antisense molecule base-pairs with a naturally occurring
nucleic acid sequence produced by the cell to form duplexes which block either transcription or
translation. The designation "negative" or "minus" can refer to the antisense strand, and the
designation "positive" or "plus" can refer to the sense strand of a reference DNA molecule.

The term "biologically active" refers to a protein having structural, regulatory, or biochemical
functions of a naturally occurring molecule. Likewise, "immunologically active" or "immunogenic"
refers to the capability of the natural, recombinant, or synthetic MDDT, or of any oligopeptide thereof,
to induce a specific immune response in appropriate animals or cells and to bind with specific
antibodies.

"Complementary" describes the relationship between two single-stranded nucleic acid
sequences that anneal by base-pairing. For example, 5'-AGT-3' pairs with its complement,
3'-TCA-5'.
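By way of illustration only, the base-pairing relationship in the example above can be sketched in Python. The function names and the restriction to the four unmodified DNA bases are assumptions made for this sketch and are not part of the specification.

```python
# Watson-Crick pairing rules for the four unmodified DNA bases (assumed alphabet).
_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Return the base-by-base complement of a 5'->3' strand.

    The result is positionally aligned with the input, so it reads 3'->5'.
    """
    return "".join(_PAIR[base] for base in seq.upper())


def reverse_complement(seq):
    """Return the complementary strand rewritten in the 5'->3' direction."""
    return complement(seq)[::-1]


def are_complementary(strand_a, strand_b):
    """True if two strands, both written 5'->3', pair along their full length."""
    return reverse_complement(strand_a) == strand_b.upper()


# 5'-AGT-3' pairs with 3'-TCA-5', which is 5'-ACT-3' when written 5'->3'.
print(complement("AGT"))          # complement read 3'->5'
print(reverse_complement("AGT"))  # same strand read 5'->3'
```

Writing both strands 5'->3' before comparing them, as `are_complementary` does, avoids the common mistake of comparing a strand against its complement without reversing it.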

A "composition comprising a given polynucleotide sequence" and a "composition comprising a 
given amino acid sequence" refer broadly to any composition containing the given polynucleotide or
amino acid sequence. The composition may comprise a dry formulation or an aqueous solution.
Compositions comprising polynucleotide sequences encoding MDDT or fragments of MDDT may be
employed as hybridization probes. The probes may be stored in freeze-dried form and may be
associated with a stabilizing agent such as a carbohydrate. In hybridizations, the probe may be
deployed in an aqueous solution containing salts (e.g., NaCl), detergents (e.g., sodium dodecyl sulfate;
SDS), and other components (e.g., Denhardt's solution, dry milk, salmon sperm DNA, etc.).

"Consensus sequence" refers to a nucleic acid sequence which has been subjected to 
repeated DNA sequence analysis to resolve uncalled bases, extended using the XL-PCR kit (Applied 
Biosystems, Foster City CA) in the 5' and/or the 3' direction, and resequenced, or which has been
assembled from one or more overlapping cDNA, EST, or genomic DNA fragments using a computer
program for fragment assembly, such as the GELVIEW fragment assembly system (GCG, Madison
WI) or Phrap (University of Washington, Seattle WA). Some sequences have been both extended
and assembled to produce the consensus sequence.

"Conservative amino acid substitutions" are those substitutions that are predicted to least 
interfere with the properties of the original protein, i.e., the structure and especially the function of the
protein is conserved and not significantly changed by such substitutions. The table below shows amino
acids which may be substituted for an original amino acid in a protein and which are regarded as
conservative amino acid substitutions.

Original Residue    Conservative Substitution
Ala                 Gly, Ser
Arg                 His, Lys
Asn                 Asp, Gln, His
Asp                 Asn, Glu
Cys                 Ala, Ser
Gln                 Asn, Glu, His
Glu                 Asp, Gln, His
Gly                 Ala
His                 Asn, Arg, Gln, Glu
Ile                 Leu, Val
Leu                 Ile, Val
Lys                 Arg, Gln, Glu
Met                 Leu, Ile
Phe                 His, Met, Leu, Trp, Tyr
Ser                 Cys, Thr
Thr                 Ser, Val
Trp                 Phe, Tyr
Tyr                 His, Phe, Trp
Val                 Ile, Leu, Thr


Conservative amino acid substitutions generally maintain (a) the structure of the polypeptide
backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation,
(b) the charge or hydrophobicity of the molecule at the site of the substitution, and/or (c) the bulk of
the side chain.
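For illustration, the substitution table above may be encoded as a lookup keyed on the original residue. This is a minimal sketch: the dictionary and function names are hypothetical, and the lookup is deliberately directional because the table is not symmetric (for example, Ser is listed for Ala, but Ala is not listed for Ser).

```python
# Conservative substitutions, transcribed from the table above.
# Keys are original residues; values are the residues listed as conservative replacements.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}


def is_conservative(original, replacement):
    """True if `replacement` is listed in the table for `original`.

    Residues absent from the table (e.g., Pro) have no listed substitutions.
    """
    return replacement in CONSERVATIVE_SUBSTITUTIONS.get(original, set())


print(is_conservative("Ala", "Ser"))  # listed for Ala
print(is_conservative("Ser", "Ala"))  # not listed for Ser
```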

A "deletion" refers to a change in the amino acid or nucleotide sequence that results in the 
absence of one or more amino acid residues or nucleotides. 

The term "derivative" refers to a chemically modified polynucleotide or polypeptide. 
Chemical modifications of a polynucleotide can include, for example, replacement of hydrogen by an 
alkyl, acyl, hydroxyl, or amino group. A derivative polynucleotide encodes a polypeptide which retains
at least one biological or immunological function of the natural molecule. A derivative polypeptide is
one modified by glycosylation, pegylation, or any similar process that retains at least one biological or
immunological function of the polypeptide from which it was derived.

A "detectable label" refers to a reporter molecule or enzyme that is capable of generating a 
measurable signal and is covalently or noncovalently joined to a polynucleotide or polypeptide.

"Differential expression" refers to increased or upregulated; or decreased, downregulated, or 
absent gene or protein expression, determined by comparing at least two different samples. Such 
comparisons may be carried out between, for example, a treated and an untreated sample, or a 
diseased and a normal sample. 

"Exon shuffling" refers to the recombination of different coding regions (exons). Since an
exon may represent a structural or functional domain of the encoded protein, new proteins may be
assembled through the novel reassortment of stable substructures, thus allowing acceleration of the
evolution of new protein functions.

A "fragment" is a unique portion of MDDT or the polynucleotide encoding MDDT which is 
identical in sequence to but shorter in length than the parent sequence. A fragment may comprise up
to the entire length of the defined sequence, minus one nucleotide/amino acid residue. For example, a
fragment may comprise from 5 to 1000 contiguous nucleotides or amino acid residues. A fragment
used as a probe, primer, antigen, therapeutic molecule, or for other purposes, may be at least 5, 10, 15,
16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous nucleotides or amino acid
residues in length. Fragments may be preferentially selected from certain regions of a molecule. For
example, a polypeptide fragment may comprise a certain length of contiguous amino acids selected
from the first 250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a certain
defined sequence. Clearly these lengths are exemplary, and any length that is supported by the
specification, including the Sequence Listing, tables, and figures, may be encompassed by the present
embodiments.

A fragment of SEQ ID NO:24-46 comprises a region of unique polynucleotide sequence that
specifically identifies SEQ ID NO:24-46, for example, as distinct from any other sequence in the
genome from which the fragment was obtained. A fragment of SEQ ID NO:24-46 is useful, for
example, in hybridization and amplification technologies and in analogous methods that distinguish SEQ
ID NO:24-46 from related polynucleotide sequences. The precise length of a fragment of SEQ ID
NO:24-46 and the region of SEQ ID NO:24-46 to which the fragment corresponds are routinely
determinable by one of ordinary skill in the art based on the intended purpose for the fragment.

A fragment of SEQ ID NO:1-23 is encoded by a fragment of SEQ ID NO:24-46. A
fragment of SEQ ID NO:1-23 comprises a region of unique amino acid sequence that specifically
identifies SEQ ID NO:1-23. For example, a fragment of SEQ ID NO:1-23 is useful as an
immunogenic peptide for the development of antibodies that specifically recognize SEQ ID NO:1-23.
The precise length of a fragment of SEQ ID NO:1-23 and the region of SEQ ID NO:1-23 to which
the fragment corresponds are routinely determinable by one of ordinary skill in the art based on the
intended purpose for the fragment.

A "full length" polynucleotide sequence is one containing at least a translation initiation codon
(e.g., methionine) followed by an open reading frame and a translation termination codon. A "full
length" polynucleotide sequence encodes a "full length" polypeptide sequence.

"Homology" refers to sequence similarity or, interchangeably, sequence identity, between two
or more polynucleotide sequences or two or more polypeptide sequences.

The terms "percent identity" and "% identity," as applied to polynucleotide sequences, refer to
the percentage of residue matches between at least two polynucleotide sequences aligned using a
standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps in
the sequences being compared in order to optimize alignment between two sequences, and therefore
achieve a more meaningful comparison of the two sequences.
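By way of a minimal sketch, percent identity as defined above can be computed from two sequences that have already been aligned, with inserted gaps written as "-". The function name, the gap character, and the choice of denominator (every aligned column except gap-versus-gap columns) are assumptions made for illustration; alignment programs may normalize differently, for example by the length of the shorter sequence.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of aligned columns in which both sequences carry the same residue.

    Both inputs must come from the same pairwise alignment and therefore
    have equal length. Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap and res_b == gap:
            continue  # column carries no information
        compared += 1
        if res_a == res_b:
            matches += 1  # a gap can never match here: both-gap columns were skipped

    return 100.0 * matches / compared if compared else 0.0


# Four matching columns out of six aligned columns.
print(round(percent_identity("ACGT-A", "ACCTTA"), 1))
```

Because the denominator is a convention rather than a fixed rule, any reported percent identity is meaningful only together with the alignment program and parameters used to produce it.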

Percent identity between polynucleotide sequences may be determined using the default
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e
sequence alignment program. This program is part of the LASERGENE software package, a suite of
molecular biological analysis programs (DNASTAR, Madison WI). CLUSTAL V is described in
Higgins, D.G. and P.M. Sharp (1989) CABIOS 5:151-153 and in Higgins, D.G. et al. (1992) CABIOS
8:189-191. For pairwise alignments of polynucleotide sequences, the default parameters are set as
follows: Ktuple=2, gap penalty=5, window=4, and "diagonals saved"=4. The "weighted" residue
weight table is selected as the default. Percent identity is reported by CLUSTAL V as the "percent
similarity" between aligned polynucleotide sequences.

Alternatively, a suite of commonly used and freely available sequence comparison algorithms
is provided by the National Center for Biotechnology Information (NCBI) Basic Local Alignment
Search Tool (BLAST) (Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410), which is available from
several sources, including the NCBI, Bethesda, MD, and on the Internet at
http://www.ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence analysis
programs including "blastn," that is used to align a known polynucleotide sequence with other
polynucleotide sequences from a variety of databases. Also available is a tool called "BLAST 2
Sequences" that is used for direct pairwise comparison of two nucleotide sequences. "BLAST 2
Sequences" can be accessed and used interactively at http://www.ncbi.nlm.nih.gov/gorf/bl2.html. The
"BLAST 2 Sequences" tool can be used for both blastn and blastp (discussed below). BLAST
programs are commonly used with gap and other parameters set to default settings. For example, to
compare two nucleotide sequences, one may use blastn with the "BLAST 2 Sequences" tool Version
2.0.12 (April-21-2000) set at default parameters. Such default parameters may be, for example:
Matrix: BLOSUM62
Reward for match: 1
Penalty for mismatch: -2
Open Gap: 5 and Extension Gap: 2 penalties
Gap x drop-off: 50
Expect: 10
Word Size: 11
Filter: on

Percent identity may be measured over the length of an entire defined sequence, for example,
as defined by a particular SEQ ID number, or may be measured over a shorter length, for example,
over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at
least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous
nucleotides. Such lengths are exemplary only, and it is understood that any fragment length supported
by the sequences shown herein, in the tables, figures, or Sequence Listing, may be used to describe a
length over which percentage identity may be measured.

Nucleic acid sequences that do not show a high degree of identity may nevertheless encode
similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes
in a nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid
sequences that all encode substantially the same protein.
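The effect of degeneracy can be illustrated with a short Python sketch. The two 12-nucleotide sequences below are invented for this example and do not correspond to any SEQ ID NO; the helper names are likewise hypothetical. The codon table is the standard genetic code, built from the conventional T, C, A, G ordering.

```python
from itertools import product

# Standard genetic code: codons enumerated with bases in T, C, A, G order,
# paired with the one-letter amino acid for each codon ("*" marks a stop codon).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna):
    """Translate a DNA coding sequence in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def matching_positions(seq_a, seq_b):
    """Count positions at which two equal-length, ungapped sequences agree."""
    return sum(a == b for a, b in zip(seq_a, seq_b))


# Hypothetical sequences that use different codons for Ala, Ser, and Arg.
seq_1 = "ATGGCTTCTCGT"
seq_2 = "ATGGCAAGCAGA"

print(translate(seq_1), translate(seq_2))
print(matching_positions(seq_1, seq_2), "of", len(seq_1), "nucleotides identical")
```

Both sequences translate to the same four-residue peptide even though only half of their nucleotide positions agree, which is the situation the paragraph above describes.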

The phrases "percent identity" and "% identity," as applied to polypeptide sequences, refer to
the percentage of residue matches between at least two polypeptide sequences aligned using a
standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some alignment
methods take into account conservative amino acid substitutions. Such conservative substitutions,
explained in more detail above, generally preserve the charge and hydrophobicity at the site of
substitution, thus preserving the structure (and therefore function) of the polypeptide.

Percent identity between polypeptide sequences may be determined using the default
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e
sequence alignment program (described and referenced above). For pairwise alignments of
polypeptide sequences using CLUSTAL V, the default parameters are set as follows: Ktuple=1, gap
penalty=3, window=5, and "diagonals saved"=5. The PAM250 matrix is selected as the default
residue weight table. As with polynucleotide alignments, the percent identity is reported by
CLUSTAL V as the "percent similarity" between aligned polypeptide sequence pairs.

Alternatively, the NCBI BLAST software suite may be used. For example, for a pairwise
comparison of two polypeptide sequences, one may use the "BLAST 2 Sequences" tool Version
2.0.12 (April-21-2000) with blastp set at default parameters. Such default parameters may be, for
example:

Matrix: BLOSUM62
Open Gap: 11 and Extension Gap: 1 penalties
Gap x drop-off: 50
Expect: 10
Word Size: 3
Filter: on

Percent identity may be measured over the length of an entire defined polypeptide sequence,
for example, as defined by a particular SEQ ID number, or may be measured over a shorter length,
for example, over the length of a fragment taken from a larger, defined polypeptide sequence, for
instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least
150 contiguous residues. Such lengths are exemplary only, and it is understood that any fragment
length supported by the sequences shown herein, in the tables, figures or Sequence Listing, may be
used to describe a length over which percentage identity may be measured.

"Human artificial chromosomes" (HACs) are linear microchromosomes which may contain
DNA sequences of about 6 kb to 10 Mb in size and which contain all of the elements required for
chromosome replication, segregation and maintenance.

The term "humanized antibody" refers to an antibody molecule in which the amino acid
sequence in the non-antigen binding regions has been altered so that the antibody more closely
resembles a human antibody, and still retains its original binding ability.

"Hybridization" refers to the process by which a polynucleotide strand anneals with a
complementary strand through base pairing under defined hybridization conditions. Specific
hybridization is an indication that two nucleic acid sequences share a high degree of complementarity.
Specific hybridization complexes form under permissive annealing conditions and remain hybridized
after the "washing" step(s). The washing step(s) is particularly important in determining the
stringency of the hybridization process, with more stringent conditions allowing less non-specific
binding, i.e., binding between pairs of nucleic acid strands that are not perfectly matched. Permissive
conditions for annealing of nucleic acid sequences are routinely determinable by one of ordinary skill in
the art and may be consistent among hybridization experiments, whereas wash conditions may be
varied among experiments to achieve the desired stringency, and therefore hybridization specificity.
Permissive annealing conditions occur, for example, at 68°C in the presence of about 6 x SSC, about
1% (w/v) SDS, and about 100 µg/ml sheared, denatured salmon sperm DNA.

Generally, stringency of hybridization is expressed, in part, with reference to the temperature
under which the wash step is carried out. Such wash temperatures are typically selected to be about
5°C to 20°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic
strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of
the target sequence hybridizes to a perfectly matched probe. An equation for calculating Tm and
conditions for nucleic acid hybridization are well known and can be found in Sambrook, J. et al. (1989)
Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Press, Plainview NY;
specifically see volume 2, chapter 9.
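The relationship between Tm and wash temperature can be sketched as follows. The specification does not reproduce an equation, so the formula used here is an assumption: it is one commonly cited empirical approximation for DNA:DNA hybrids longer than roughly 100 nucleotides, and published coefficients vary between sources. The function names and the example sequence are hypothetical.

```python
import math


def melting_temp_c(seq, na_molar, formamide_pct=0.0):
    """Approximate Tm (degrees C) of a long DNA:DNA hybrid.

    Assumed empirical form (coefficients differ between references):
        Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%G+C) - 0.63*(%formamide) - 600/L
    where [Na+] is the monovalent cation concentration in mol/L and L is the
    hybrid length in nucleotides.
    """
    seq = seq.upper()
    length = len(seq)
    gc_pct = 100.0 * (seq.count("G") + seq.count("C")) / length
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_pct
        - 0.63 * formamide_pct
        - 600.0 / length
    )


def wash_temperature_range_c(tm_c):
    """Wash temperatures about 5 to 20 degrees C below Tm, as described in the text."""
    return (tm_c - 20.0, tm_c - 5.0)


# Hypothetical 200-nucleotide probe with 50% G+C at 1 M monovalent cation.
probe = "ACGT" * 50
tm = melting_temp_c(probe, na_molar=1.0)
low, high = wash_temperature_range_c(tm)
print(f"Tm ~ {tm:.1f} C; wash window ~ {low:.1f} to {high:.1f} C")
```

The sketch shows the direction of each effect rather than a validated prediction: lower salt and added formamide both lower Tm, so a wash at a fixed temperature becomes more stringent as salt concentration is reduced.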

High stringency conditions for hybridization between polynucleotides of the present invention
include wash conditions of 68°C in the presence of about 0.2 x SSC and about 0.1% SDS, for 1 hour.
Alternatively, temperatures of about 65°C, 60°C, 55°C, or 42°C may be used. SSC concentration may
be varied from about 0.1 to 2 x SSC, with SDS being present at about 0.1%. Typically, blocking
reagents are used to block non-specific hybridization. Such blocking reagents include, for instance,
sheared and denatured salmon sperm DNA at about 100-200 µg/ml. Organic solvent, such as
formamide at a concentration of about 35-50% v/v, may also be used under particular circumstances,
such as for RNA:DNA hybridizations. Useful variations on these wash conditions will be readily
apparent to those of ordinary skill in the art. Hybridization, particularly under high stringency
conditions, may be suggestive of evolutionary similarity between the nucleotides. Such similarity is
strongly indicative of a similar role for the nucleotides and their encoded polypeptides.

The term "hybridization complex" refers to a complex formed between two nucleic acid
sequences by virtue of the formation of hydrogen bonds between complementary bases. A
hybridization complex may be formed in solution (e.g., Cot or Rot analysis) or formed between one
nucleic acid sequence present in solution and another nucleic acid sequence immobilized on a solid
support (e.g., paper, membranes, filters, chips, pins or glass slides, or any other appropriate substrate
to which cells or their nucleic acids have been fixed).

The words "insertion" and "addition" refer to changes in an amino acid or nucleotide
sequence resulting in the addition of one or more amino acid residues or nucleotides, respectively.

"Immune response" can refer to conditions associated with inflammation, trauma, immune
disorders, or infectious or genetic disease, etc. These conditions can be characterized by expression
of various factors, e.g., cytokines, chemokines, and other signaling molecules, which may affect
cellular and systemic defense systems.

An "immunogenic fragment" is a polypeptide or oligopeptide fragment of MDDT which is 
capable of eliciting an immune response when introduced into a living organism, for example, a
mammal. The term "immunogenic fragment" also includes any polypeptide or oligopeptide fragment
of MDDT which is useful in any of the antibody production methods disclosed herein or known in the
art.

The term "microarray" refers to an arrangement of a plurality of polynucleotides,
polypeptides, or other chemical compounds on a substrate.

The terms "element" and "array element" refer to a polynucleotide, polypeptide, or other
chemical compound having a unique and defined position on a microarray.

The term "modulate" refers to a change in the activity of MDDT. For example, modulation 
may cause an increase or a decrease in protein activity, binding characteristics, or any other biological,
functional, or immunological properties of MDDT.

The phrases "nucleic acid" and "nucleic acid sequence" refer to a nucleotide, oligonucleotide,
polynucleotide, or any fragment thereof. These phrases also refer to DNA or RNA of genomic or
synthetic origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA), or to any DNA-like or RNA-like material.

"Operably linked" refers to the situation in which a first nucleic acid sequence is placed in a
functional relationship with a second nucleic acid sequence. For instance, a promoter is operably
linked to a coding sequence if the promoter affects the transcription or expression of the coding
sequence. Operably linked DNA sequences may be in close proximity or contiguous and, where
necessary to join two protein coding regions, in the same reading frame.

"Peptide nucleic acid" (PNA) refers to an antisense molecule or anti-gene agent which
comprises an oligonucleotide of at least about 5 nucleotides in length linked to a peptide backbone of
amino acid residues ending in lysine. The terminal lysine confers solubility to the composition. PNAs
preferentially bind complementary single stranded DNA or RNA and stop transcript elongation, and
may be pegylated to extend their lifespan in the cell.

"Post-translational modification" of an MDDT may involve lipidation, glycosylation, 
phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in the 
art. These processes may occur synthetically or biochemically. Biochemical modifications will vary
by cell type depending on the enzymatic milieu of MDDT.

"Probe" refers to nucleic acid sequences encoding MDDT, their complements, or fragments
thereof, which are used to detect identical, allelic or related nucleic acid sequences. Probes are
isolated oligonucleotides or polynucleotides attached to a detectable label or reporter molecule.
Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes. "Primers"
are short nucleic acids, usually DNA oligonucleotides, which may be annealed to a target
polynucleotide by complementary base-pairing. The primer may then be extended along the target
DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification (and
identification) of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR).

Probes and primers as used in the present invention typically comprise at least 15 contiguous
nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also
be employed, such as probes and primers that comprise at least 20, 25, 30, 40, 50, 60, 70, 80, 90, 100,
or at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers
may be considerably longer than these examples, and it is understood that any length supported by the
specification, including the tables, figures, and Sequence Listing, may be used.

Methods for preparing and using probes and primers are described in the references, for
example Sambrook, J. et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold
Spring Harbor Press, Plainview NY; Ausubel, F.M. et al. (1987) Current Protocols in Molecular
Biology, Greene Publ. Assoc. & Wiley-Intersciences, New York NY; Innis, M. et al. (1990) PCR
Protocols, A Guide to Methods and Applications, Academic Press, San Diego CA. PCR primer pairs
can be derived from a known sequence, for example, by using computer programs intended for that
purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge
MA).

Oligonucleotides for use as primers are selected using software known in the art for such
purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to
100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to 5,000
nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer selection
programs have incorporated additional features for expanded capabilities. For example, the PrimOU
primer selection program (available to the public from the Genome Center at University of Texas
South West Medical Center, Dallas TX) is capable of choosing specific primers from megabase
sequences and is thus useful for designing primers on a genome-wide scope. The Primer3 primer
selection program (available to the public from the Whitehead Institute/MIT Center for Genome
Research, Cambridge MA) allows the user to input a "mispriming library," in which sequences to
avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the selection of
oligonucleotides for microarrays. (The source code for the latter two primer selection programs may
also be obtained from their respective sources and modified to meet the user's specific needs.) The
PrimeGen program (available to the public from the UK Human Genome Mapping Project Resource
Centre, Cambridge UK) designs primers based on multiple sequence alignments, thereby allowing
selection of primers that hybridize to either the most conserved or least conserved regions of aligned
nucleic acid sequences. Hence, this program is useful for identification of both unique and conserved
oligonucleotides and polynucleotide fragments. The oligonucleotides and polynucleotide fragments
identified by any of the above selection methods are useful in hybridization technologies, for example,
as PCR or sequencing primers, microarray elements, or specific probes to identify fully or partially
complementary polynucleotides in a sample of nucleic acids. Methods of oligonucleotide selection are 
not limited to those described above. 

A "Recombinant nucleic acid" is a sequence that is not naturally occurring or has a sequence 
that is made by an artificial combination of two or more otherwise separated segments of sequence.
This artificial combination is often accomplished by chemical synthesis or, more commonly, by the
artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques
such as those described in Sambrook, supra. The term recombinant includes nucleic acids that have
been altered solely by addition, substitution, or deletion of a portion of the nucleic acid. Frequently, a
recombinant nucleic acid may include a nucleic acid sequence operably linked to a promoter
sequence. Such a recombinant nucleic acid may be part of a vector that is used, for example, to
transform a cell.

Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a
vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is
expressed, inducing a protective immunological response in the mammal.

A "regulatory element" refers to a nucleic acid sequence usually derived from untranslated
regions of a gene and includes enhancers, promoters, introns, and 5' and 3' untranslated regions
(UTRs). Regulatory elements interact with host or viral proteins which control transcription,
translation, or RNA stability.

"Reporter molecules" are chemical or biochemical moieties used for labeling a nucleic acid, 
amino acid, or antibody. Reporter molecules include radionuclides; enzymes; fluorescent,
chemiluminescent, or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and
other moieties known in the art.

An "RNA equivalent," in reference to a DNA sequence, is composed of the same linear
sequence of nucleotides as the reference DNA sequence with the exception that all occurrences of
the nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose
instead of deoxyribose.
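At the level of the written sequence, the RNA equivalent differs from the reference DNA only in the substitution of U for T, since the change of sugar is not visible in single-letter notation. A minimal Python sketch follows; the function name is hypothetical and the input is assumed to contain only the letters A, C, G, and T in either case.

```python
# Map thymine to uracil in both upper and lower case; other letters pass through unchanged.
_T_TO_U = str.maketrans("Tt", "Uu")


def rna_equivalent(dna):
    """Return the RNA equivalent of a DNA sequence: same linear order, T replaced by U."""
    return dna.translate(_T_TO_U)


print(rna_equivalent("ATGCTTAG"))
```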

The term "sample" is used in its broadest sense. A sample suspected of containing MDDT,
nucleic acids encoding MDDT, or fragments thereof may comprise a bodily fluid; an extract from a
cell, chromosome, organelle, or membrane isolated from a cell; a cell; genomic DNA, RNA, or cDNA,
in solution or bound to a substrate; a tissue; a tissue print; etc.

The terms "specific binding" and "specifically binding" refer to that interaction between a
protein or peptide and an agonist, an antibody, an antagonist, a small molecule, or any natural or
synthetic binding composition. The interaction is dependent upon the presence of a particular structure
of the protein, e.g., the antigenic determinant or epitope, recognized by the binding molecule. For
example, if an antibody is specific for epitope "A," the presence of a polypeptide comprising the
epitope A, or the presence of free unlabeled A, in a reaction containing free labeled A and the
antibody will reduce the amount of labeled A that binds to the antibody.

The term "substantially purified" refers to nucleic acid or amino acid sequences that are
removed from their natural environment and are isolated or separated, and are at least 60% free,
preferably at least 75% free, and most preferably at least 90% free from other components with
which they are naturally associated.

A "substitution" refers to the replacement of one or more amino acid residues or nucleotides
by different amino acid residues or nucleotides, respectively.

"Substrate" refers to any suitable rigid or semi-rigid support including membranes, filters, 
chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers,
microparticles and capillaries. The substrate can have a variety of surface forms, such as wells,
trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound.

A "transcript image" or "expression profile" refers to the collective pattern of gene expression
by a particular cell type or tissue under given conditions at a given time.

"Transformation" describes a process by which exogenous DNA is introduced into a recipient
cell. Transformation may occur under natural or artificial conditions according to various methods
well known in the art, and may rely on any known method for the insertion of foreign nucleic acid
sequences into a prokaryotic or eukaryotic host cell. The method for transformation is selected based
on the type of host cell being transformed and may include, but is not limited to, bacteriophage or viral
infection, electroporation, heat shock, lipofection, and particle bombardment. The term "transformed
cells" includes stably transformed cells in which the inserted DNA is capable of replication either as
an autonomously replicating plasmid or as part of the host chromosome, as well as transiently
transformed cells which express the inserted DNA or RNA for limited periods of time.

A "transgenic organism," as used herein, is any organism, including but not limited to animals
and plants, in which one or more of the cells of the organism contains heterologous nucleic acid
introduced by way of human intervention, such as by transgenic techniques well known in the art. The
nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor of the cell,
by way of deliberate genetic manipulation, such as by microinjection or by infection with a
recombinant virus. In one alternative, the nucleic acid can be introduced by infection with a
recombinant viral vector, such as a lentiviral vector (Lois, C. et al. (2002) Science 295:868-872). The
term genetic manipulation does not include classical cross-breeding, or in vitro fertilization, but rather is
directed to the introduction of a recombinant DNA molecule. The transgenic organisms contemplated
in accordance with the present invention include bacteria, cyanobacteria, fungi, plants and animals.
The isolated DNA of the present invention can be introduced into the host by methods known in the
art, for example infection, transfection, transformation or transconjugation. Techniques for
transferring the DNA of the present invention into such organisms are widely known and provided in
references such as Sambrook et al. (1989), supra.

A "variant" of a particular nucleic acid sequence is defined as a nucleic acid sequence having
at least 40% sequence identity to the particular nucleic acid sequence over a certain length of one of
the nucleic acid sequences using blastn with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999)
set at default parameters. Such a pair of nucleic acids may show, for example, at least 50%, at
least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least
93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater
sequence identity over a certain defined length. A variant may be described as, for example, an
"allelic" (as defined above), "splice," "species," or "polymorphic" variant. A splice variant may have
significant identity to a reference molecule, but will generally have a greater or lesser number of
polynucleotides due to alternate splicing of exons during mRNA processing. The corresponding
polypeptide may possess additional functional domains or lack domains that are present in the
reference molecule. Species variants are polynucleotide sequences that vary from one species to
another. The resulting polypeptides will generally have significant amino acid identity relative to each
other. A polymorphic variant is a variation in the polynucleotide sequence of a particular gene
between individuals of a given species. Polymorphic variants also may encompass "single nucleotide
polymorphisms" (SNPs) in which the polynucleotide sequence varies by one nucleotide base. The
presence of SNPs may be indicative of, for example, a certain population, a disease state, or a
propensity for a disease state.

A "variant" of a particular polypeptide sequence is defined as a polypeptide sequence having
at least 40% sequence identity to the particular polypeptide sequence over a certain length of one of
the polypeptide sequences using blastp with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999)
set at default parameters. Such a pair of polypeptides may show, for example, at least 50%, at
least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least
94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence
identity over a certain defined length of one of the polypeptides.
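
The identity thresholds in the two "variant" definitions can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (for example by BLAST 2 Sequences) and that gaps are written as "-"; it does not reproduce the blastn or blastp algorithms, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Assumption: inputs are pre-aligned, equal-length strings with '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def is_variant(aligned_a: str, aligned_b: str, threshold: float = 40.0) -> bool:
    """True when identity over the aligned length meets the threshold (40% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Raising `threshold` to 90.0 or 95.0 corresponds to the narrower identity levels listed in the definitions.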

THE INVENTION

The invention is based on the discovery of new human molecules for disease detection and
treatment (MDDT), the polynucleotides encoding MDDT, and the use of these compositions for the
diagnosis, treatment, or prevention of cell proliferative, autoimmune/inflammatory, developmental, and
neurological disorders, and infections.

Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide
sequences of the invention. Each polynucleotide and its corresponding polypeptide are correlated to a
single Incyte project identification number (Incyte Project ID). Each polypeptide sequence is denoted
by both a polypeptide sequence identification number (Polypeptide SEQ ID NO:) and an Incyte
polypeptide sequence number (Incyte Polypeptide ID) as shown. Each polynucleotide sequence is
denoted by both a polynucleotide sequence identification number (Polynucleotide SEQ ID NO:) and an
Incyte polynucleotide consensus sequence number (Incyte Polynucleotide ID) as shown. Column 6
shows the Incyte ID numbers of physical, full length clones corresponding to the polypeptide and
polynucleotide sequences of the invention. The full length clones encode polypeptides which have at
least 95% sequence identity to the polypeptide sequences shown in column 3.

Table 2 shows sequences with homology to the polypeptides of the invention as identified by
BLAST analysis against the GenBank protein (genpept) database and the PROTEOME database.
Columns 1 and 2 show the polypeptide sequence identification number (Polypeptide SEQ ID NO:) and
the corresponding Incyte polypeptide sequence number (Incyte Polypeptide ID) for polypeptides of
the invention. Column 3 shows the GenBank identification number (GenBank ID NO:) of the nearest
GenBank homolog and the PROTEOME database identification numbers (PROTEOME ID NO:) of
the nearest PROTEOME database homologs. Column 4 shows the probability scores for the matches
between each polypeptide and its homolog(s). Column 5 shows the annotation of the GenBank and
PROTEOME database homolog(s) along with relevant citations where applicable, all of which are
expressly incorporated by reference herein.

Table 3 shows various structural features of the polypeptides of the invention. Columns 1 and
2 show the polypeptide sequence identification number (SEQ ID NO:) and the corresponding Incyte
polypeptide sequence number (Incyte Polypeptide ID) for each polypeptide of the invention. Column
3 shows the number of amino acid residues in each polypeptide. Column 4 shows potential
phosphorylation sites, and column 5 shows potential glycosylation sites, as determined by the MOTIFS
program of the GCG sequence analysis software package (Genetics Computer Group, Madison WI).
Column 6 shows amino acid residues comprising signature sequences, domains, and motifs. Column 7
shows analytical methods for protein structure/function analysis and in some cases, searchable
databases to which the analytical methods were applied.

Together, Tables 2 and 3 summarize the properties of polypeptides of the invention, and these
properties establish that the claimed polypeptides are molecules for disease detection and treatment.
For example, SEQ ID NO:1 is 42% identical, from residue M1 to residue D482, to human RO52 gene
product (GenBank ID g747927) as determined by the Basic Local Alignment Search Tool (BLAST).
(See Table 2.) The BLAST probability score is 9.8e-97, which indicates the probability of obtaining
the observed polypeptide sequence alignment by chance. SEQ ID NO:1 also contains a SPRY
domain, a B-box zinc finger domain, and a RING finger C3HC4 type zinc finger domain, as
determined by searching for statistically significant matches in the hidden Markov model (HMM)-based
PFAM database of conserved protein family domains. (See Table 3.) Data from BLIMPS,
MOTIFS, and PROFILESCAN analyses provide further corroborative evidence that SEQ ID NO:1 is
a transcription factor. In another example, SEQ ID NO:9 is 86% identical, from residue M1 to residue
R722, to mouse DNA binding protein DESRT (GenBank ID g9622226) as determined by the Basic
Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 0.0, which
indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ
ID NO:9 also contains an ARID DNA binding domain as determined by searching for statistically
significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein
family domains. Data from further BLAST analyses provide further corroborative evidence that SEQ
ID NO:9 is a DNA-binding protein. In a further example, SEQ ID NO:11 is 81% identical, from
residue R8 to residue S86, to human HERV-E integrase (GenBank ID g2S87Q26) as determined by
the Basic Local Alignment Search Tool (BLAST). The BLAST probability score is 2.7e-32, which
indicates the probability of obtaining the observed polypeptide sequence alignment by chance. Data
from BLAST analyses provide further corroborative evidence that SEQ ID NO:11 is an integrase
protease. In yet a further example, SEQ ID NO:16 is 98% identical, from residue M1 to residue
A928, to human prostate antigen PARIS-1 (GenBank ID g12963885) as determined by the Basic
Local Alignment Search Tool (BLAST). The BLAST probability score is 0.0, which indicates the
probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:16 also
contains a PH domain and a TBC domain as determined by searching for statistically significant
matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family
domains. Data from BLIMPS and BLAST analyses provide further corroborative evidence that SEQ
ID NO:16 is a full-length human protein for disease detection and treatment. SEQ ID NO:2-8, SEQ
ID NO:10, SEQ ID NO:12-15, and SEQ ID NO:17-23 were analyzed and annotated in a similar
manner. The algorithms and parameters for the analysis of SEQ ID NO:1-23 are described in Table
7.

As shown in Table 4, the full length polynucleotide sequences of the present invention were
assembled using cDNA sequences or coding (exon) sequences derived from genomic DNA, or any
combination of these two types of sequences. Column 1 lists the polynucleotide sequence
identification number (Polynucleotide SEQ ID NO:), the corresponding Incyte polynucleotide
consensus sequence number (Incyte ID) for each polynucleotide of the invention, and the length of
each polynucleotide sequence in basepairs. Column 2 shows the nucleotide start (5') and stop (3')
positions of the cDNA and/or genomic sequences used to assemble the full length polynucleotide
sequences of the invention, and of fragments of the polynucleotide sequences which are useful, for
example, in hybridization or amplification technologies that identify SEQ ID NO:24-46 or that
distinguish between SEQ ID NO:24-46 and related polynucleotide sequences.

The polynucleotide fragments described in Column 2 of Table 4 may refer specifically, for
example, to Incyte cDNAs derived from tissue-specific cDNA libraries or from pooled cDNA
libraries. Alternatively, the polynucleotide fragments described in column 2 may refer to GenBank
cDNAs or ESTs which contributed to the assembly of the full length polynucleotide sequences. In
addition, the polynucleotide fragments described in column 2 may identify sequences derived from the
ENSEMBL (The Sanger Centre, Cambridge, UK) database (i.e., those sequences including the
designation "ENST"). Alternatively, the polynucleotide fragments described in column 2 may be
derived from the NCBI RefSeq Nucleotide Sequence Records Database (i.e., those sequences
including the designation "NM" or "NT") or the NCBI RefSeq Protein Sequence Records (i.e., those
sequences including the designation "NP"). Alternatively, the polynucleotide fragments described in
column 2 may refer to assemblages of both cDNA and Genscan-predicted exons brought together by
an "exon stitching" algorithm. For example, a polynucleotide sequence identified as
FL_XXXXXX_N1_N2_YYYYY_N3_N4 represents a "stitched" sequence in which XXXXXX is the
identification number of the cluster of sequences to which the algorithm was applied, and YYYYY is the
number of the prediction generated by the algorithm, and N1,2,3..., if present, represent specific exons
that may have been manually edited during analysis (See Example V). Alternatively, the
polynucleotide fragments in column 2 may refer to assemblages of exons brought together by an
"exon-stretching" algorithm. For example, a polynucleotide sequence identified as
FLXXXXXX_gAAAAA_gBBBBB_1_N is a "stretched" sequence, with XXXXXX being the Incyte
project identification number, gAAAAA being the GenBank identification number of the human
genomic sequence to which the "exon-stretching" algorithm was applied, gBBBBB being the GenBank
identification number or NCBI RefSeq identification number of the nearest GenBank protein homolog,
and N referring to specific exons (See Example V). In instances where a RefSeq sequence was used
as a protein homolog for the "exon-stretching" algorithm, a RefSeq identifier (denoted by "NM,"
"NP," or "NT") may be used in place of the GenBank identifier (i.e., gBBBBB).

Alternatively, a prefix identifies component sequences that were hand-edited, predicted from
genomic DNA sequences, or derived from a combination of sequence analysis methods. The
following Table lists examples of component sequence prefixes and corresponding sequence analysis
methods associated with the prefixes (see Example IV and Example V).



Prefix           Type of analysis and/or examples of programs
---------------  ----------------------------------------------------------------
GNN, GFG, ENST   Exon prediction from genomic sequences using, for example,
                 GENSCAN (Stanford University, CA, USA) or FGENES
                 (Computer Genomics Group, The Sanger Centre, Cambridge, UK).
GBI              Hand-edited analysis of genomic sequences.
FL               Stitched or stretched genomic sequences (see Example V).
INCY             Full length transcript and exon prediction from mapping of EST
                 sequences to the genome. Genomic location and EST composition
                 data are combined to predict the exons and resulting transcript.



In some cases, Incyte cDNA coverage redundant with the sequence coverage shown in
Table 4 was obtained to confirm the final consensus polynucleotide sequence, but the relevant Incyte
cDNA identification numbers are not shown.

Table 5 shows the representative cDNA libraries for those full length polynucleotide 
sequences which were assembled using Incyte cDNA sequences. The representative cDNA library 
is the Incyte cDNA library which is most frequently represented by the Incyte cDNA sequences
which were used to assemble and confirm the above polynucleotide sequences. The tissues and
vectors which were used to construct the cDNA libraries shown in Table 5 are described in Table 6. 

Table 8 shows single nucleotide polymorphisms (SNPs) found in polynucleotide sequences of
the invention, along with allele frequencies in different human populations. Columns 1 and 2 show the
polynucleotide sequence identification number (SEQ ID NO:) and the corresponding Incyte project
identification number (PID) for polynucleotides of the invention. Column 3 shows the Incyte
identification number for the EST in which the SNP was detected (EST ID), and column 4 shows the
identification number for the SNP (SNP ID). Column 5 shows the position within the EST sequence
at which the SNP is located (EST SNP), and column 6 shows the position of the SNP within the
full-length polynucleotide sequence (CB1 SNP). Column 7 shows the allele found in the EST sequence.
Columns 8 and 9 show the two alleles found at the SNP site. Column 10 shows the amino acid
encoded by the codon including the SNP site, based upon the allele found in the EST. Columns 11-14
show the frequency of allele 1 in four different human populations. An entry of n/d (not detected)
indicates that the frequency of allele 1 in the population was too low to be detected, while n/a (not
available) indicates that the allele frequency was not determined for the population.

The invention also encompasses MDDT variants. A preferred MDDT variant is one which
has at least about 80%, or alternatively at least about 90%, or even at least about 95% amino acid
sequence identity to the MDDT amino acid sequence, and which contains at least one functional or
structural characteristic of MDDT.

The invention also encompasses polynucleotides which encode MDDT. In a particular
embodiment, the invention encompasses a polynucleotide sequence comprising a sequence selected
from the group consisting of SEQ ID NO:24-46, which encodes MDDT. The polynucleotide
sequences of SEQ ID NO:24-46, as presented in the Sequence Listing, embrace the equivalent RNA
sequences, wherein occurrences of the nitrogenous base thymine are replaced with uracil, and the
sugar backbone is composed of ribose instead of deoxyribose.
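
The DNA-to-RNA equivalence described above amounts, at the level of the written sequence, to replacing each thymine with uracil. A minimal Python sketch follows; the function name and the accepted alphabet (A, C, G, T, N) are assumptions for illustration, and the change of sugar backbone is of course not represented in a text string.

```python
def dna_to_rna(dna: str) -> str:
    """Return the RNA spelling of a DNA sequence (T replaced by U).

    Assumption: the input uses only the letters A, C, G, T, or N.
    """
    seq = dna.upper()
    invalid = set(seq) - set("ACGTN")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    return seq.replace("T", "U")
```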

The invention also encompasses a variant of a polynucleotide sequence encoding MDDT. In
particular, such a variant polynucleotide sequence will have at least about 70%, or alternatively at least
about 85%, or even at least about 95% polynucleotide sequence identity to the polynucleotide
sequence encoding MDDT. A particular aspect of the invention encompasses a variant of a
polynucleotide sequence comprising a sequence selected from the group consisting of SEQ ID NO:24-46
which has at least about 70%, or alternatively at least about 85%, or even at least about 95%
polynucleotide sequence identity to a nucleic acid sequence selected from the group consisting of SEQ
ID NO:24-46. Any one of the polynucleotide variants described above can encode an amino acid
sequence which contains at least one functional or structural characteristic of MDDT.

In addition, or in the alternative, a polynucleotide variant of the invention is a splice variant of a
polynucleotide sequence encoding MDDT. A splice variant may have portions which have significant
sequence identity to the polynucleotide sequence encoding MDDT, but will generally have a greater or
lesser number of polynucleotides due to additions or deletions of blocks of sequence arising from
alternate splicing of exons during mRNA processing. A splice variant may have less than about 70%,
or alternatively less than about 60%, or alternatively less than about 50% polynucleotide sequence
identity to the polynucleotide sequence encoding MDDT over its entire length; however, portions of
the splice variant will have at least about 70%, or alternatively at least about 85%, or alternatively at
least about 95%, or alternatively 100% polynucleotide sequence identity to portions of the
polynucleotide sequence encoding MDDT. For example, a polynucleotide comprising a sequence of
SEQ ID NO:25 is a splice variant of a polynucleotide comprising a sequence of SEQ ID NO:45, and a
polynucleotide comprising a sequence of SEQ ID NO:36 is a splice variant of a polynucleotide
comprising a sequence of SEQ ID NO:46. Any one of the splice variants described above can
encode an amino acid sequence which contains at least one functional or structural characteristic of
MDDT.

It will be appreciated by those skilled in the art that as a result of the degeneracy of the
genetic code, a multitude of polynucleotide sequences encoding MDDT, some bearing minimal
similarity to the polynucleotide sequences of any known and naturally occurring gene, may be
produced. Thus, the invention contemplates each and every possible variation of polynucleotide
sequence that could be made by selecting combinations based on possible codon choices. These
combinations are made in accordance with the standard triplet genetic code as applied to the
polynucleotide sequence of naturally occurring MDDT, and all such variations are to be considered as
being specifically disclosed.
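
To give a sense of how large this "multitude" is, the following Python sketch counts the distinct coding sequences that specify a given peptide under the standard genetic code by multiplying the number of synonymous codons for each residue. The function name is hypothetical; the per-residue codon counts are those of the standard code (61 sense codons in total), and stop codons are not counted.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide: str) -> int:
    """Number of distinct nucleotide sequences encoding the peptide."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide.upper())
```

Because the count is a product over residues, it grows exponentially with peptide length, which is why the text treats the variations as a class rather than listing them.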

Although nucleotide sequences which encode MDDT and its variants are generally capable of
hybridizing to the nucleotide sequence of the naturally occurring MDDT under appropriately selected
conditions of stringency, it may be advantageous to produce nucleotide sequences encoding MDDT or
its derivatives possessing a substantially different codon usage, e.g., inclusion of non-naturally
occurring codons. Codons may be selected to increase the rate at which expression of the peptide
occurs in a particular prokaryotic or eukaryotic host in accordance with the frequency with which
particular codons are utilized by the host. Other reasons for substantially altering the nucleotide
sequence encoding MDDT and its derivatives without altering the encoded amino acid sequences
include the production of RNA transcripts having more desirable properties, such as a greater half-life,
than transcripts produced from the naturally occurring sequence.
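
Codon selection according to host usage can be sketched as choosing, for each residue, the codon the host uses most often. The Python example below is illustrative only: the function name is hypothetical, and the frequencies in `TOY_USAGE` are placeholder values chosen for the example, not measured codon usage for any organism.

```python
def recode_for_host(peptide: str, usage: dict) -> str:
    """Back-translate a peptide using the most frequent codon per residue.

    `usage` maps each amino acid letter to a {codon: relative frequency} dict.
    """
    codons = []
    for aa in peptide.upper():
        options = usage.get(aa)
        if not options:
            raise KeyError(f"no codon usage data for residue {aa!r}")
        # Pick the codon with the highest relative frequency for this residue.
        codons.append(max(options, key=options.get))
    return "".join(codons)


# Placeholder frequencies for illustration only (not real host data).
TOY_USAGE = {
    "M": {"ATG": 1.0},
    "K": {"AAA": 0.7, "AAG": 0.3},
    "L": {"CTG": 0.5, "TTA": 0.15, "CTC": 0.1,
          "TTG": 0.1, "CTT": 0.1, "CTA": 0.05},
}
```

Always taking the single most frequent codon is the simplest possible policy; it leaves the encoded amino acid sequence unchanged, consistent with the paragraph above, but ignores other considerations such as transcript stability.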

The invention also encompasses production of DNA sequences which encode MDDT and
MDDT derivatives, or fragments thereof, entirely by synthetic chemistry. After production, the
synthetic sequence may be inserted into any of the many available expression vectors and cell systems
using reagents well known in the art. Moreover, synthetic chemistry may be used to introduce
mutations into a sequence encoding MDDT or any fragment thereof.

Also encompassed by the invention are polynucleotide sequences that are capable of
hybridizing to the claimed polynucleotide sequences, and, in particular, to those shown in SEQ ID
NO:24-46 and fragments thereof under various conditions of stringency. (See, e.g., Wahl, G.M. and
S.L. Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A.R. (1987) Methods Enzymol. 152:507-511.)
Hybridization conditions, including annealing and wash conditions, are described in

Methods for DNA sequencing are well known in the art and may be used to practice any of
the embodiments of the invention. The methods may employ such enzymes as the Klenow fragment
of DNA polymerase I, SEQUENASE (US Biochemical, Cleveland OH), Taq polymerase (Applied
Biosystems), thermostable T7 polymerase (Amersham Pharmacia Biotech, Piscataway NJ), or
combinations of polymerases and proofreading exonucleases such as those found in the ELONGASE
amplification system (Life Technologies, Gaithersburg MD). Preferably, sequence preparation is
automated with machines such as the MICROLAB 2200 liquid transfer system (Hamilton, Reno NV),
PTC200 thermal cycler (MJ Research, Watertown MA) and ABI CATALYST 800 thermal cycler
(Applied Biosystems). Sequencing is then carried out using either the ABI 373 or 377 DNA
sequencing system (Applied Biosystems), the MEGABACE 1000 DNA sequencing system
(Molecular Dynamics, Sunnyvale CA), or other systems known in the art. The resulting sequences
are analyzed using a variety of algorithms which are well known in the art. (See, e.g., Ausubel, F.M.
(1997) Short Protocols in Molecular Biology, John Wiley & Sons, New York NY, unit 7.7; Meyers,
R.A. (1995) Molecular Biology and Biotechnology, Wiley VCH, New York NY, pp. 856-853.)

The nucleic acid sequences encoding MDDT may be extended utilizing a partial nucleotide
sequence and employing various PCR-based methods known in the art to detect upstream sequences,
such as promoters and regulatory elements. For example, one method which may be employed,
restriction-site PCR, uses universal and nested primers to amplify unknown sequence from genomic
DNA within a cloning vector. (See, e.g., Sarkar, G. (1993) PCR Methods Applic. 2:318-322.)
Another method, inverse PCR, uses primers that extend in divergent directions to amplify unknown
sequence from a circularized template. The template is derived from restriction fragments comprising
a known genomic locus and surrounding sequences. (See, e.g., Triglia, T. et al. (1988) Nucleic Acids
Res. 16:8186.) A third method, capture PCR, involves PCR amplification of DNA fragments adjacent
to known sequences in human and yeast artificial chromosome DNA. (See, e.g., Lagerstrom, M. et
al. (1991) PCR Methods Applic. 1:111-119.) In this method, multiple restriction enzyme digestions and
ligations may be used to insert an engineered double-stranded sequence into a region of unknown
sequence before performing PCR. Other methods which may be used to retrieve unknown sequences
are known in the art. (See, e.g., Parker, J.D. et al. (1991) Nucleic Acids Res. 19:3055-3060.)
Additionally, one may use PCR, nested primers, and PROMOTERFINDER libraries (Clontech, Palo
Alto CA) to walk genomic DNA. This procedure avoids the need to screen libraries and is useful in
finding intron/exon junctions. For all PCR-based methods, primers may be designed using
commercially available software, such as OLIGO 4.06 primer analysis software (National
Biosciences, Plymouth MN) or another appropriate program, to be about 22 to 30 nucleotides in length,
to have a GC content of about 50% or more, and to anneal to the template at temperatures of about
68°C to 72°C.
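
The length and GC-content criteria stated above can be checked with a few lines of Python. This is a minimal sketch with hypothetical function names; it does not reproduce the OLIGO 4.06 software, and it does not model annealing temperature, which depends on the melting-temperature formula and reaction conditions chosen.

```python
def gc_content(primer: str) -> float:
    """Percent of bases in the primer that are G or C."""
    seq = primer.upper()
    if not seq:
        raise ValueError("primer is empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def meets_design_criteria(primer: str, min_len: int = 22, max_len: int = 30,
                          min_gc: float = 50.0) -> bool:
    """True if the primer is 22 to 30 nt long with GC content of at least 50%.

    The length check runs first, so an empty primer returns False
    without reaching the GC calculation.
    """
    return min_len <= len(primer) <= max_len and gc_content(primer) >= min_gc
```

The defaults mirror the approximate ranges in the text ("about 22 to 30 nucleotides", "about 50% or more") and can be loosened by passing different limits.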

When screening for full length cDNAs, it is preferable to use libraries that have been
size-selected to include larger cDNAs. In addition, random-primed libraries, which often include
sequences containing the 5' regions of genes, are preferable for situations in which an oligo d(T)
library does not yield a full-length cDNA. Genomic libraries may be useful for extension of sequence
into 5' non-transcribed regulatory regions.

Capillary electrophoresis systems which are commercially available may be used to analyze
the size or confirm the nucleotide sequence of sequencing or PCR products. In particular, capillary
sequencing may employ flowable polymers for electrophoretic separation, four different
nucleotide-specific, laser-stimulated fluorescent dyes, and a charge coupled device camera for detection of the
emitted wavelengths. Output/light intensity may be converted to electrical signal using appropriate
software (e.g., GENOTYPER and SEQUENCE NAVIGATOR, Applied Biosystems), and the entire
process from loading of samples to computer analysis and electronic data display may be computer
controlled. Capillary electrophoresis is especially preferable for sequencing small DNA fragments
which may be present in limited amounts in a particular sample.

In another embodiment of the invention, polynucleotide sequences or fragments thereof which
encode MDDT may be cloned in recombinant DNA molecules that direct expression of MDDT, or
fragments or functional equivalents thereof, in appropriate host cells. Due to the inherent degeneracy
of the genetic code, other DNA sequences which encode substantially the same or a functionally
equivalent amino acid sequence may be produced and used to express MDDT.

The nucleotide sequences of the present invention can be engineered using methods generally
known in the art in order to alter MDDT-encoding sequences for a variety of purposes including, but
not limited to, modification of the cloning, processing, and/or expression of the gene product. DNA
shuffling by random fragmentation and PCR reassembly of gene fragments and synthetic
oligonucleotides may be used to engineer the nucleotide sequences. For example, oligonucleotide-
mediated site-directed mutagenesis may be used to introduce mutations that create new restriction
sites, alter glycosylation patterns, change codon preference, produce splice variants, and so forth.

The nucleotides of the present invention may be subjected to DNA shuffling techniques such
as MOLECULARBREEDING (Maxygen Inc., Santa Clara CA; described in U.S. Patent No.
5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, F.C. et al. (1999) Nat.
Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or improve
the biological properties of MDDT, such as its biological or enzymatic activity or its ability to bind to
other molecules or compounds. DNA shuffling is a process by which a library of gene variants is
produced using PCR-mediated recombination of gene fragments. The library is then subjected to
selection or screening procedures that identify those gene variants with the desired properties. These
preferred variants may then be pooled and further subjected to recursive rounds of DNA shuffling and
selection/screening. Thus, genetic diversity is created through "artificial" breeding and rapid
molecular evolution. For example, fragments of a single gene containing random point mutations may
be recombined, screened, and then reshuffled until the desired properties are optimized. Alternatively,
fragments of a given gene may be recombined with fragments of homologous genes in the same gene
family, either from the same or different species, thereby maximizing the genetic diversity of multiple
naturally occurring genes in a directed and controllable manner.

In another embodiment, sequences encoding MDDT may be synthesized, in whole or in part,
using chemical methods well known in the art. (See, e.g., Caruthers, M.H. et al. (1980) Nucleic Acids
Symp. Ser. 7:215-223; and Horn, T. et al. (1980) Nucleic Acids Symp. Ser. 7:225-232.) Alternatively,
MDDT itself or a fragment thereof may be synthesized using chemical methods. For example,
peptide synthesis can be performed using various solution-phase or solid-phase techniques. (See, e.g.,
Creighton, T. (1984) Proteins, Structures and Molecular Properties, WH Freeman, New York NY, pp.
55-60; and Roberge, J.Y. et al. (1995) Science 269:202-204.) Automated synthesis may be achieved
using the ABI 431A peptide synthesizer (Applied Biosystems). Additionally, the amino acid sequence
of MDDT, or any part thereof, may be altered during direct synthesis and/or combined with sequences
from other proteins, or any part thereof, to produce a variant polypeptide or a polypeptide having a
sequence of a naturally occurring polypeptide.

The peptide may be substantially purified by preparative high performance liquid 
chromatography. (See, e.g., Chiez, R.M. and F.Z. Regnier (1990) Methods Enzymol. 182:392-421.)
The composition of the synthetic peptides may be confirmed by amino acid analysis or by sequencing.
(See, e.g., Creighton, supra, pp. 28-53.) 

In order to express a biologically active MDDT, the nucleotide sequences encoding MDDT or
derivatives thereof may be inserted into an appropriate expression vector, i.e., a vector which contains
the necessary elements for transcriptional and translational control of the inserted coding sequence in
a suitable host. These elements include regulatory sequences, such as enhancers, constitutive and
inducible promoters, and 5' and 3' untranslated regions in the vector and in polynucleotide sequences
encoding MDDT. Such elements may vary in their strength and specificity. Specific initiation signals
may also be used to achieve more efficient translation of sequences encoding MDDT. Such signals
include the ATG initiation codon and adjacent sequences, e.g. the Kozak sequence. In cases where
sequences encoding MDDT and its initiation codon and upstream regulatory sequences are inserted
into the appropriate expression vector, no additional transcriptional or translational control signals may
be needed. However, in cases where only coding sequence, or a fragment thereof, is inserted,
exogenous translational control signals including an in-frame ATG initiation codon should be provided
by the vector. Exogenous translational elements and initiation codons may be of various origins, both
natural and synthetic. The efficiency of expression may be enhanced by the inclusion of enhancers
appropriate for the particular host cell system used. (See, e.g., Scharf, D. et al. (1994) Results Probl.
Cell Differ. 20:125-162.)
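
By way of illustration only, the requirement for an in-frame ATG initiation codon and a favorable
adjacent sequence can be checked computationally. The following Python sketch is not part of the
methods described herein. The function names first_inframe_atg and kozak_strength are hypothetical,
and the scoring rule (a purine at position -3 and a G at position +4, counting the A of the ATG as +1)
is a commonly used simplification of the Kozak consensus that is assumed here rather than taken from
the text.

```python
from typing import Optional


def first_inframe_atg(seq: str, frame: int = 0) -> Optional[int]:
    """Return the index of the first ATG in the given reading frame, or None."""
    seq = seq.upper()
    # Step through the sequence one codon at a time, starting at the frame offset.
    for i in range(frame % 3, len(seq) - 2, 3):
        if seq[i:i + 3] == "ATG":
            return i
    return None


def kozak_strength(seq: str, atg_index: int) -> str:
    """Classify the context of the ATG at atg_index as weak, adequate, or strong.

    Assumed simplification: one point for a purine (A or G) at -3 and one
    point for a G at +4; two points is 'strong', one is 'adequate'.
    """
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    minus3 = seq[atg_index - 3] if atg_index >= 3 else ""
    plus4 = seq[atg_index + 3] if atg_index + 3 < len(seq) else ""
    hits = int(minus3 in ("A", "G")) + int(plus4 == "G")
    return ("weak", "adequate", "strong")[hits]
```

For example, a hypothetical insert "GCCACCATGGCT" read in frame 0 has its first in-frame ATG at
index 6, in a context this sketch classifies as strong.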

Methods which are well known to those skilled in the art may be used to construct expression
vectors containing sequences encoding MDDT and appropriate transcriptional and translational control
elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in
vivo genetic recombination. (See, e.g., Sambrook, J. et al. (1989) Molecular Cloning, A Laboratory
Manual, Cold Spring Harbor Press, Plainview NY, ch. 4, 8, and 16-17; Ausubel, F.M. et al. (1995)
Current Protocols in Molecular Biology, John Wiley & Sons, New York NY, ch. 9, 13, and 16.)

A variety of expression vector/host systems may be utilized to contain and express sequences
encoding MDDT. These include, but are not limited to, microorganisms such as bacteria transformed
with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with
yeast expression vectors; insect cell systems infected with viral expression vectors (e.g., baculovirus);
plant cell systems transformed with viral expression vectors (e.g., cauliflower mosaic virus, CaMV, or
tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or
animal cell systems. (See, e.g., Sambrook, supra; Ausubel, supra; Van Heeke, G. and S.M. Schuster
(1989) J. Biol. Chem. 264:5503-5509; Engelhard, E.K. et al. (1994) Proc. Natl. Acad. Sci. USA
91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945; Takamatsu, N. (1987) EMBO
J. 6:307-311; The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New
York NY, pp. 191-196; Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659; and
Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355.) Expression vectors derived from retroviruses,
adenoviruses, or herpes or vaccinia viruses, or from various bacterial plasmids, may be used for
delivery of nucleotide sequences to the targeted organ, tissue, or cell population. (See, e.g., Di Nicola,
M. et al. (1998) Cancer Gen. Ther. 5(6):350-356; Yu, M. et al. (1993) Proc. Natl. Acad. Sci. USA
90(13):6340-6344; Buller, R.M. et al. (1985) Nature 317(6040):813-815; McGregor, D.P. et al. (1994)
Mol. Immunol. 31(3):219-226; and Verma, I.M. and N. Somia (1997) Nature 389:239-242.) The
invention is not limited by the host cell employed.

In bacterial systems, a number of cloning and expression vectors may be selected depending
upon the use intended for polynucleotide sequences encoding MDDT. For example, routine cloning,
subcloning, and propagation of polynucleotide sequences encoding MDDT can be achieved using a
multifunctional E. coli vector such as PBLUESCRIPT (Stratagene, La Jolla CA) or PSPORT1
plasmid (Life Technologies). Ligation of sequences encoding MDDT into the vector's multiple cloning
site disrupts the lacZ gene, allowing a colorimetric screening procedure for identification of
transformed bacteria containing recombinant molecules. In addition, these vectors may be useful for
in vitro transcription, dideoxy sequencing, single strand rescue with helper phage, and creation of
nested deletions in the cloned sequence. (See, e.g., Van Heeke, G. and S.M. Schuster (1989) J. Biol.
Chem. 264:5503-5509.) When large quantities of MDDT are needed, e.g. for the production of
antibodies, vectors which direct high level expression of MDDT may be used. For example, vectors
containing the strong, inducible SP6 or T7 bacteriophage promoter may be used.

Yeast expression systems may be used for production of MDDT. A number of vectors
containing constitutive or inducible promoters, such as alpha factor, alcohol oxidase, and PGH
promoters, may be used in the yeast Saccharomyces cerevisiae or Pichia pastoris. In addition, such
vectors direct either the secretion or intracellular retention of expressed proteins and enable
integration of foreign sequences into the host genome for stable propagation. (See, e.g., Ausubel,
1995, supra; Bitter, G.A. et al. (1987) Methods Enzymol. 153:516-544; and Scorer, C.A. et al. (1994)
Bio/Technology 12:181-184.)

Plant systems may also be used for expression of MDDT. Transcription of sequences
encoding MDDT may be driven by viral promoters, e.g., the 35S and 19S promoters of CaMV used
alone or in combination with the omega leader sequence from TMV (Takamatsu, N. (1987) EMBO J.
6:307-311). Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock
promoters may be used. (See, e.g., Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al.
(1984) Science 224:838-843; and Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105.) These
constructs can be introduced into plant cells by direct DNA transformation or pathogen-mediated
transfection. (See, e.g., The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill,
New York NY, pp. 191-196.)

In mammalian cells, a number of viral-based expression systems may be utilized. In cases
where an adenovirus is used as an expression vector, sequences encoding MDDT may be ligated into
an adenovirus transcription/translation complex consisting of the late promoter and tripartite leader
sequence. Insertion in a non-essential E1 or E3 region of the viral genome may be used to obtain
infective virus which expresses MDDT in host cells. (See, e.g., Logan, J. and T. Shenk (1984) Proc.
Natl. Acad. Sci. USA 81:3655-3659.) In addition, transcription enhancers, such as the Rous sarcoma
virus (RSV) enhancer, may be used to increase expression in mammalian host cells. SV40 or
EBV-based vectors may also be used for high-level protein expression.

Human artificial chromosomes (HACs) may also be employed to deliver larger fragments of
DNA than can be contained in and expressed from a plasmid. HACs of about 6 kb to 10 Mb are
constructed and delivered via conventional delivery methods (liposomes, polycationic amino polymers,
or vesicles) for therapeutic purposes. (See, e.g., Harrington, J.J. et al. (1997) Nat. Genet.
15:345-355.)

For long term production of recombinant proteins in mammalian systems, stable expression of
MDDT in cell lines is preferred. For example, sequences encoding MDDT can be transformed into
cell lines using expression vectors which may contain viral origins of replication and/or endogenous
expression elements and a selectable marker gene on the same or on a separate vector. Following the
introduction of the vector, cells may be allowed to grow for about 1 to 2 days in enriched media before
being switched to selective media. The purpose of the selectable marker is to confer resistance to a
selective agent, and its presence allows growth and recovery of cells which successfully express the
introduced sequences. Resistant clones of stably transformed cells may be propagated using tissue
culture techniques appropriate to the cell type.

Any number of selection systems may be used to recover transformed cell lines. These
include, but are not limited to, the herpes simplex virus thymidine kinase and adenine
phosphoribosyltransferase genes, for use in tk- and apr- cells, respectively. (See, e.g., Wigler, M. et
al. (1977) Cell 11:223-232; Lowy, I. et al. (1980) Cell 22:817-823.) Also, antimetabolite, antibiotic, or
herbicide resistance can be used as the basis for selection. For example, dhfr confers resistance to
methotrexate; neo confers resistance to the aminoglycosides neomycin and G-418; and als and pat
confer resistance to chlorsulfuron and phosphinotricin acetyltransferase, respectively. (See, e.g.,
Wigler, M. et al. (1980) Proc. Natl. Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981)
J. Mol. Biol. 150:1-14.) Additional selectable genes have been described, e.g., trpB and hisD, which
alter cellular requirements for metabolites. (See, e.g., Hartman, S.C. and R.C. Mulligan (1988) Proc.
Natl. Acad. Sci. USA 85:8047-8051.) Visible markers, e.g., anthocyanins, green fluorescent proteins
(GFP; Clontech), β-glucuronidase and its substrate β-glucuronide, or luciferase and its substrate
luciferin may be used. These markers can be used not only to identify transformants, but also to
quantify the amount of transient or stable protein expression attributable to a specific vector system.
(See, e.g., Rhodes, C.A. (1995) Methods Mol. Biol. 55:121-131.)

Although the presence/absence of marker gene expression suggests that the gene of interest
is also present, the presence and expression of the gene may need to be confirmed. For example, if
the sequence encoding MDDT is inserted within a marker gene sequence, transformed cells
containing sequences encoding MDDT can be identified by the absence of marker gene function.
Alternatively, a marker gene can be placed in tandem with a sequence encoding MDDT under the
control of a single promoter. Expression of the marker gene in response to induction or selection
usually indicates expression of the tandem gene as well.

In general, host cells that contain the nucleic acid sequence encoding MDDT and that express
MDDT may be identified by a variety of procedures known to those of skill in the art. These
procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridizations, PCR
amplification, and protein bioassay or immunoassay techniques which include membrane, solution, or
chip based technologies for the detection and/or quantification of nucleic acid or protein sequences.

Immunological methods for detecting and measuring the expression of MDDT using either
specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques
include enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and
fluorescence activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing
monoclonal antibodies reactive to two non-interfering epitopes on MDDT is preferred, but a
competitive binding assay may be employed. These and other assays are well known in the art. (See,
e.g., Hampton, R. et al. (1990) Serological Methods, a Laboratory Manual, APS Press, St Paul MN,
Sect. IV; Coligan, J.E. et al. (1997) Current Protocols in Immunology, Greene Pub. Associates and
Wiley-Interscience, New York NY; and Pound, J.D. (1998) Immunochemical Protocols, Humana
Press, Totowa NJ.)

A wide variety of labels and conjugation techniques are known by those skilled in the art and
may be used in various nucleic acid and amino acid assays. Means for producing labeled hybridization
or PCR probes for detecting sequences related to polynucleotides encoding MDDT include
oligolabeling, nick translation, end-labeling, or PCR amplification using a labeled nucleotide.
Alternatively, the sequences encoding MDDT, or any fragments thereof, may be cloned into a vector
for the production of an mRNA probe. Such vectors are known in the art, are commercially available,
and may be used to synthesize RNA probes in vitro by addition of an appropriate RNA polymerase
such as T7, T3, or SP6 and labeled nucleotides. These procedures may be conducted using a variety
of commercially available kits, such as those provided by Amersham Pharmacia Biotech, Promega
(Madison WI), and US Biochemical. Suitable reporter molecules or labels which may be used for
ease of detection include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic
agents, as well as substrates, cofactors, inhibitors, magnetic particles, and the like.

Host cells transformed with nucleotide sequences encoding MDDT may be cultured under
conditions suitable for the expression and recovery of the protein from cell culture. The protein
produced by a transformed cell may be secreted or retained intracellularly depending on the sequence
and/or the vector used. As will be understood by those of skill in the art, expression vectors containing
polynucleotides which encode MDDT may be designed to contain signal sequences which direct
secretion of MDDT through a prokaryotic or eukaryotic cell membrane.

In addition, a host cell strain may be chosen for its ability to modulate expression of the
inserted sequences or to process the expressed protein in the desired fashion. Such modifications of
the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation,
lipidation, and acylation. Post-translational processing which cleaves a "prepro" or "pro" form of the
protein may also be used to specify protein targeting, folding, and/or activity. Different host cells
which have specific cellular machinery and characteristic mechanisms for post-translational activities
(e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the American Type Culture
Collection (ATCC, Manassas VA) and may be chosen to ensure the correct modification and
processing of the foreign protein.

In another embodiment of the invention, natural, modified, or recombinant nucleic acid
sequences encoding MDDT may be ligated to a heterologous sequence resulting in translation of a
fusion protein in any of the aforementioned host systems. For example, a chimeric MDDT protein
containing a heterologous moiety that can be recognized by a commercially available antibody may
facilitate the screening of peptide libraries for inhibitors of MDDT activity. Heterologous protein and
peptide moieties may also facilitate purification of fusion proteins using commercially available affinity
matrices. Such moieties include, but are not limited to, glutathione S-transferase (GST), maltose
binding protein (MBP), thioredoxin (Trx), calmodulin binding peptide (CBP), 6-His, FLAG, c-myc, and
hemagglutinin (HA). GST, MBP, Trx, CBP, and 6-His enable purification of their cognate fusion
proteins on immobilized glutathione, maltose, phenylarsine oxide, calmodulin, and metal-chelate resins,
respectively. FLAG, c-myc, and hemagglutinin (HA) enable immunoaffinity purification of fusion
proteins using commercially available monoclonal and polyclonal antibodies that specifically recognize
these epitope tags. A fusion protein may also be engineered to contain a proteolytic cleavage site
located between the MDDT encoding sequence and the heterologous protein sequence, so that
MDDT may be cleaved away from the heterologous moiety following purification. Methods for fusion
protein expression and purification are discussed in Ausubel (1995, supra, ch. 10). A variety of
commercially available kits may also be used to facilitate expression and purification of fusion
proteins.

In a further embodiment of the invention, synthesis of radiolabeled MDDT may be achieved in
vitro using the TNT rabbit reticulocyte lysate or wheat germ extract system (Promega). These
systems couple transcription and translation of protein-coding sequences operably associated with the
T7, T3, or SP6 promoters. Translation takes place in the presence of a radiolabeled amino acid
precursor, for example, 35S-methionine.

MDDT of the present invention or fragments thereof may be used to screen for compounds
that specifically bind to MDDT. At least one and up to a plurality of test compounds may be screened
for specific binding to MDDT. Examples of test compounds include antibodies, oligonucleotides,
proteins (e.g., receptors), or small molecules.

In one embodiment, the compound thus identified is closely related to the natural ligand of
MDDT, e.g., a ligand or fragment thereof, a natural substrate, a structural or functional mimetic, or a
natural binding partner. (See, e.g., Coligan, J.E. et al. (1991) Current Protocols in Immunology 1(2):
Chapter 5.) Similarly, the compound can be closely related to the natural receptor to which MDDT
binds, or to at least a fragment of the receptor, e.g., the ligand binding site. In either case, the
compound can be rationally designed using known techniques. In one embodiment, screening for
these compounds involves producing appropriate cells which express MDDT, either as a secreted
protein or on the cell membrane. Preferred cells include cells from mammals, yeast, Drosophila, or
E. coli. Cells expressing MDDT or cell membrane fractions which contain MDDT are then contacted
with a test compound and binding, stimulation, or inhibition of activity of either MDDT or the
compound is analyzed.

An assay may simply test binding of a test compound to the polypeptide, wherein binding is
detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. For example, the
assay may comprise the steps of combining at least one test compound with MDDT, either in solution
or affixed to a solid support, and detecting the binding of MDDT to the compound. Alternatively, the
assay may detect or measure binding of a test compound in the presence of a labeled competitor.
Additionally, the assay may be carried out using cell-free preparations, chemical libraries, or natural
product mixtures, and the test compound(s) may be free in solution or affixed to a solid support.
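
By way of illustration only, a readout from an assay that uses a labeled competitor might be reduced
to a percent displacement value. The Python sketch below is a hedged example and not a prescribed
method. The function name percent_displacement, the background-subtraction step, and the signal
units are assumptions introduced for this example.

```python
def percent_displacement(signal_with_test: float,
                         signal_without_test: float,
                         background: float = 0.0) -> float:
    """Percent of labeled-competitor signal lost when the test compound is present.

    signal_without_test is the label bound to MDDT with no test compound;
    signal_with_test is the label bound in the presence of the test compound;
    background is an assumed nonspecific signal subtracted from both readings.
    """
    max_bound = signal_without_test - background
    if max_bound <= 0:
        raise ValueError("signal without test compound must exceed background")
    bound = signal_with_test - background
    return 100.0 * (1.0 - bound / max_bound)
```

Under these assumptions, a reading of 550 units with the test compound, 1050 units without it, and a
background of 50 units corresponds to a displacement of 50 percent.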

MDDT of the present invention or fragments thereof may be used to screen for compounds
that modulate the activity of MDDT. Such compounds may include agonists, antagonists, or partial or
inverse agonists. In one embodiment, an assay is performed under conditions permissive for MDDT
activity, wherein MDDT is combined with at least one test compound, and the activity of MDDT in
the presence of a test compound is compared with the activity of MDDT in the absence of the test
compound. A change in the activity of MDDT in the presence of the test compound is indicative of a
compound that modulates the activity of MDDT. Alternatively, a test compound is combined with an
in vitro or cell-free system comprising MDDT under conditions suitable for MDDT activity, and the
assay is performed. In either of these assays, a test compound which modulates the activity of
MDDT may do so indirectly and need not come in direct contact with MDDT. At least
one and up to a plurality of test compounds may be screened.
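
By way of illustration only, the comparison of MDDT activity in the presence and absence of a test
compound can be expressed as a ratio of mean activities. The Python sketch below is a hedged
example. The function name classify_modulation and the 20 percent threshold are assumptions
introduced here, since the text does not specify how large a change is to be regarded as indicative
of modulation.

```python
from statistics import mean
from typing import Sequence


def classify_modulation(activity_with: Sequence[float],
                        activity_without: Sequence[float],
                        threshold: float = 0.2) -> str:
    """Compare mean activity with and without a test compound.

    threshold is an assumed fractional change (0.2 means 20 percent)
    below which no change is called.
    """
    baseline = mean(activity_without)
    if baseline <= 0:
        raise ValueError("baseline activity must be positive")
    ratio = mean(activity_with) / baseline
    if ratio >= 1.0 + threshold:
        return "increases activity"
    if ratio <= 1.0 - threshold:
        return "decreases activity"
    return "no change detected"
```

In practice the threshold would be chosen from the variability of replicate measurements, and a
statistical test could replace the fixed cutoff used in this sketch.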

In another embodiment, polynucleotides encoding MDDT or their mammalian homologs may
be "knocked out" in an animal model system using homologous recombination in embryonic stem (ES)
cells. Such techniques are well known in the art and are useful for the generation of animal models of
human disease. (See, e.g., U.S. Patent No. 5,175,383 and U.S. Patent No. 5,767,337.) For example,
mouse ES cells, such as the mouse 129/SvJ cell line, are derived from the early mouse embryo and
grown in culture. The ES cells are transformed with a vector containing the gene of interest disrupted
by a marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi, M.R. (1989) Science
244:1288-1292). The vector integrates into the corresponding region of the host genome by
homologous recombination. Alternatively, homologous recombination takes place using the Cre-loxP
system to knockout a gene of interest in a tissue- or developmental stage-specific manner (Marth, J.D.
(1996) Clin. Invest. 97:1999-2002; Wagner, K.U. et al. (1997) Nucleic Acids Res. 25:4323-4330).
Transformed ES cells are identified and microinjected into mouse cell blastocysts such as those from
the C57BL/6 mouse strain. The blastocysts are surgically transferred to pseudopregnant dams, and
the resulting chimeric progeny are genotyped and bred to produce heterozygous or homozygous
strains. Transgenic animals thus generated may be tested with potential therapeutic or toxic agents.

Polynucleotides encoding MDDT may also be manipulated in vitro in ES cells derived from
human blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell
lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate
into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J.A. et al.
(1998) Science 282:1145-1147).

Polynucleotides encoding MDDT can also be used to create "knockin" humanized animals
(pigs) or transgenic animals (mice or rats) to model human disease. With knockin technology, a region
of a polynucleotide encoding MDDT is injected into animal ES cells, and the injected sequence
integrates into the animal cell genome. Transformed cells are injected into blastulae, and the blastulae
are implanted as described above. Transgenic progeny or inbred lines are studied and treated with
potential pharmaceutical agents to obtain information on treatment of a human disease. Alternatively,
a mammal inbred to overexpress MDDT, e.g., by secreting MDDT in its milk, may also serve as a
convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).

THERAPEUTICS

Chemical and structural similarity, e.g., in the context of sequences and motifs, exists between
regions of MDDT and molecules for disease detection and treatment. In addition, examples of tissues
and cell lines expressing MDDT are vascular smooth muscle cells, human aortic endothelial cells,
human iliac artery endothelial cells, and human umbilical vein endothelial cells, and also can be found
in Table 6. Therefore, MDDT appears to play a role in cell proliferative, autoimmune/inflammatory,
developmental, and neurological disorders, and infections. In the treatment of disorders associated
with increased MDDT expression or activity, it is desirable to decrease the expression or activity of
MDDT. In the treatment of disorders associated with decreased MDDT expression or activity, it is
desirable to increase the expression or activity of MDDT.

Therefore, in one embodiment, MDDT or a fragment or derivative thereof may be
administered to a subject to treat or prevent a disorder associated with decreased expression or
activity of MDDT. Examples of such disorders include, but are not limited to, a cell proliferative
disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed
connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia
vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, leukemia,
lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, a cancer of the adrenal
gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract,
heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin,
spleen, testis, thymus, thyroid, and uterus; an autoimmune/inflammatory disorder such as inflammation,
actinic keratosis, acquired immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory
distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, arteriosclerosis, asthma,
atherosclerosis, autoimmune hemolytic anemia, autoimmune thyroiditis, bronchitis, bursitis,
cholecystitis, cirrhosis, contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes
mellitus, emphysema, erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis,
Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis, paroxysmal nocturnal
hemoglobinuria, hepatitis, hypereosinophilia, irritable bowel syndrome, episodic lymphopenia with
lymphocytotoxins, mixed connective tissue disease (MCTD), multiple sclerosis, myasthenia gravis,
myocardial or pericardial inflammation, myelofibrosis, osteoarthritis, osteoporosis, pancreatitis,
polycythemia vera, polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma,
Sjogren's syndrome, systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, primary
thrombocythemia, thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome, autoimmune
polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), episodic lymphopenia with
lymphocytotoxins, complications of cancer, hemodialysis, and extracorporeal circulation, trauma, and
hematopoietic cancer including lymphoma, leukemia, and myeloma; a developmental disorder such as
renal tubular acidosis, anemia, Cushing's syndrome, achondroplastic dwarfism, Duchenne and Becker
muscular dystrophy, epilepsy, gonadal dysgenesis, WAGR syndrome (Wilms' tumor, aniridia,
genitourinary abnormalities, and mental retardation), Smith-Magenis syndrome, myelodysplastic
syndrome, hereditary mucoepithelial dysplasia, hereditary keratodermas, hereditary neuropathies such
as Charcot-Marie-Tooth disease and neurofibromatosis, hypothyroidism, hydrocephalus, seizure
disorders such as Sydenham's chorea and cerebral palsy, spina bifida, anencephaly,
craniorachischisis, congenital glaucoma, cataract, and sensorineural hearing loss; a neurological
disorder such as epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's
disease, Pick's disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal
disorders, amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural muscular
atrophy, retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases,
bacterial and viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative
intracranial thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion
diseases including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome,
fatal familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis,
tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental
retardation and other developmental disorder of the central nervous system, cerebral palsy, a
neuroskeletal disorder, an autonomic nervous system disorder, a cranial nerve disorder, a spinal cord
disease, muscular dystrophy and other neuromuscular disorder, a peripheral nervous system disorder,
dermatomyositis and polymyositis, inherited, metabolic, endocrine, and toxic myopathy, myasthenia
gravis, periodic paralysis, a mental disorder including mood, anxiety, and schizophrenic disorder,
seasonal affective disorder (SAD), akathesia, amnesia, catatonia, diabetic neuropathy, tardive
dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia, and Tourette's disorder; and an
infection, such as those caused by a viral agent classified as adenovirus (acute respiratory disease,
pneumonia), arenavirus (lymphocytic choriomeningitis), bunyavirus (Hantavirus), calicivirus,
coronavirus (pneumonia, chronic bronchitis), filovirus, hepadnavirus (hepatitis), herpesvirus (herpes
simplex virus, varicella-zoster virus, Epstein-Barr virus, cytomegalovirus), flavivirus (yellow fever),
orthomyxovirus (influenza), parvovirus, papovavirus or papillomavirus (cancer), paramyxovirus
(measles, mumps), picornavirus (rhinovirus, poliovirus, coxsackie-virus), polyomavirus (BK virus, JC
virus), poxvirus (smallpox), reovirus (Colorado tick fever), retrovirus (human immunodeficiency virus,
human T lymphotropic virus), rhabdovirus (rabies), rotavirus (gastroenteritis), and togavirus
(encephalitis, rubella); an infection caused by a bacterial agent classified as pneumococcus,
staphylococcus, streptococcus, bacillus, corynebacterium, clostridium, meningococcus, gonococcus,
listeria, moraxella, kingella, haemophilus, legionella, bordetella, gram-negative enterobacterium
including shigella, salmonella, or campylobacter, pseudomonas, vibrio, brucella, francisella, yersinia,
bartonella, norcardium, actinomyces, mycobacterium, spirochaetale, rickettsia, chlamydia, or
mycoplasma; an infection caused by a fungal agent classified as aspergillus, blastomyces,
dermatophytes, cryptococcus, coccidioides, malasezzia, histoplasma, or other mycosis-causing fungal
agent; and an infection caused by a parasite classified as plasmodium or malaria-causing, parasitic
entamoeba, leishmania, trypanosoma, toxoplasma, Pneumocystis carinii, intestinal protozoa such as
giardia, trichomonas, tissue nematode such as trichinella, intestinal nematode such as ascaris,
lymphatic filarial nematode, trematode such as schistosoma, and cestode such as tapeworm.

In another embodiment, a vector capable of expressing MDDT or a fragment or derivative
thereof may be administered to a subject to treat or prevent a disorder associated with decreased
expression or activity of MDDT including, but not limited to, those described above.

In a further embodiment, a composition comprising a substantially purified MDDT in
conjunction with a suitable pharmaceutical carrier may be administered to a subject to treat or prevent
a disorder associated with decreased expression or activity of MDDT including, but not limited to,
those provided above.

In still another embodiment, an agonist which modulates the activity of MDDT may be
administered to a subject to treat or prevent a disorder associated with decreased expression or
activity of MDDT including, but not limited to, those listed above.

In a further embodiment, an antagonist of MDDT may be administered to a subject to treat or
prevent a disorder associated with increased expression or activity of MDDT. Examples of such
disorders include, but are not limited to, those cell proliferative, autoimmune/inflammatory,
developmental, and neurological disorders, and infections described above. In one aspect, an antibody
which specifically binds MDDT may be used directly as an antagonist or indirectly as a targeting or
delivery mechanism for bringing a pharmaceutical agent to cells or tissues which express MDDT.

In an additional embodiment, a vector expressing the complement of the polynucleotide
encoding MDDT may be administered to a subject to treat or prevent a disorder associated with
increased expression or activity of MDDT including, but not limited to, those described above.

In other embodiments, any of the proteins, antagonists, antibodies, agonists, complementary
sequences, or vectors of the invention may be administered in combination with other appropriate
therapeutic agents. Selection of the appropriate agents for use in combination therapy may be made
by one of ordinary skill in the art, according to conventional pharmaceutical principles. The
combination of therapeutic agents may act synergistically to effect the treatment or prevention of the
various disorders described above. Using this approach, one may be able to achieve therapeutic
efficacy with lower dosages of each agent, thus reducing the potential for adverse side effects.

An antagonist of MDDT may be produced using methods which are generally known in the
art. In particular, purified MDDT may be used to produce antibodies or to screen libraries of
pharmaceutical agents to identify those which specifically bind MDDT. Antibodies to MDDT may
also be generated using methods that are well known in the art. Such antibodies may include, but are
not limited to, polyclonal, monoclonal, chimeric, and single chain antibodies, Fab fragments, and
fragments produced by a Fab expression library. Neutralizing antibodies (i.e., those which inhibit
dimer formation) are generally preferred for therapeutic use. Single chain antibodies (e.g., from
camels or llamas) may be potent enzyme inhibitors and may have advantages in the design of peptide
mimetics, and in the development of immuno-adsorbents and biosensors (Muyldermans, S. (2001) J.
Biotechnol. 74:277-302).

For the production of antibodies, various hosts including goats, rabbits, rats, mice, camels,
dromedaries, llamas, humans, and others may be immunized by injection with MDDT or with any
fragment or oligopeptide thereof which has immunogenic properties. Depending on the host species,
various adjuvants may be used to increase immunological response. Such adjuvants include, but are
not limited to, Freund's, mineral gels such as aluminum hydroxide, and surface active substances such
as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, KLH, and dinitrophenol. Among
adjuvants used in humans, BCG (bacilli Calmette-Guerin) and Corynebacterium parvum are especially
preferable.

It is preferred that the oligopeptides, peptides, or fragments used to induce antibodies to
MDDT have an amino acid sequence consisting of at least about 5 amino acids, and generally will
consist of at least about 10 amino acids. It is also preferable that these oligopeptides, peptides, or
fragments are identical to a portion of the amino acid sequence of the natural protein. Short stretches
of MDDT amino acids may be fused with those of another protein, such as KLH, and antibodies to
the chimeric molecule may be produced.

Monoclonal antibodies to MDDT may be prepared using any technique which provides for the
production of antibody molecules by continuous cell lines in culture. These include, but are not limited
to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma
technique. (See, e.g., Kohler, G. et al. (1975) Nature 256:495-497; Kozbor, D. et al. (1985) J.
Immunol. Methods 81:31-42; Cote, R.J. et al. (1983) Proc. Natl. Acad. Sci. USA 80:2026-2030; and
Cole, S.P. et al. (1984) Mol. Cell Biol. 62:109-120.)

In addition, techniques developed for the production of "chimeric antibodies," such as the
splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate
antigen specificity and biological activity, can be used. (See, e.g., Morrison, S.L. et al. (1984) Proc.
Natl. Acad. Sci. USA 81:6851-6855; Neuberger, M.S. et al. (1984) Nature 312:604-608; and Takeda,
S. et al. (1985) Nature 314:452-454.) Alternatively, techniques described for the production of single
chain antibodies may be adapted, using methods known in the art, to produce MDDT-specific single
chain antibodies. Antibodies with related specificity, but of distinct idiotypic composition, may be
generated by chain shuffling from random combinatorial immunoglobulin libraries. (See, e.g., Burton,
D.R. (1991) Proc. Natl. Acad. Sci. USA 88:10134-10137.)

Antibodies may also be produced by inducing in vivo production in the lymphocyte population 
or by screening immunoglobulin libraries or panels of highly specific binding reagents as disclosed in 
the literature. (See, e.g., Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci. USA 86:3833-3837; Winter,
G. et al. (1991) Nature 349:293-299.) 

Antibody fragments which contain specific binding sites for MDDT may also be generated.
For example, such fragments include, but are not limited to, F(ab')2 fragments produced by pepsin
digestion of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of
the F(ab')2 fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and
easy identification of monoclonal Fab fragments with the desired specificity. (See, e.g., Huse, W.D.
et al. (1989) Science 246:1275-1281.)

Various immunoassays may be used for screening to identify antibodies having the desired
specificity. Numerous protocols for competitive binding or immunoradiometric assays using either
polyclonal or monoclonal antibodies with established specificities are well known in the art. Such
immunoassays typically involve the measurement of complex formation between MDDT and its
specific antibody. A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive
to two non-interfering MDDT epitopes is generally used, but a competitive binding assay may also be 
employed (Pound, supra). 

Various methods such as Scatchard analysis in conjunction with radioimmunoassay techniques
may be used to assess the affinity of antibodies for MDDT. Affinity is expressed as an association
constant, Ka, which is defined as the molar concentration of MDDT-antibody complex divided by the
molar concentrations of free antigen and free antibody under equilibrium conditions. The Ka
determined for a preparation of polyclonal antibodies, which are heterogeneous in their affinities for
multiple MDDT epitopes, represents the average affinity, or avidity, of the antibodies for MDDT. The
Ka determined for a preparation of monoclonal antibodies, which are monospecific for a particular
MDDT epitope, represents a true measure of affinity. High-affinity antibody preparations with Ka
ranging from about 10^9 to 10^12 L/mole are preferred for use in immunoassays in which the MDDT-
antibody complex must withstand rigorous manipulations. Low-affinity antibody preparations with Ka
ranging from about 10^6 to 10^7 L/mole are preferred for use in immunopurification and similar
procedures which ultimately require dissociation of MDDT, preferably in active form, from the
antibody (Catty, D. (1988) Antibodies, Volume I: A Practical Approach, IRL Press, Washington DC;
Liddell, J.E. and A. Cryer (1991) A Practical Guide to Monoclonal Antibodies, John Wiley & Sons,
New York NY).
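
The definition of the association constant given above can be illustrated with a short calculation. The following Python sketch is offered only as an illustration; the function name and the example concentrations are hypothetical and are not taken from the cited references.

```python
def association_constant(complex_molar, free_antigen_molar, free_antibody_molar):
    """Return Ka in L/mole from equilibrium molar concentrations.

    Ka = [antigen-antibody complex] / ([free antigen] * [free antibody])
    """
    if free_antigen_molar <= 0 or free_antibody_molar <= 0:
        raise ValueError("free antigen and free antibody concentrations must be positive")
    if complex_molar < 0:
        raise ValueError("complex concentration cannot be negative")
    return complex_molar / (free_antigen_molar * free_antibody_molar)


# Hypothetical equilibrium values, all in mol/L (illustrative only).
ka = association_constant(
    complex_molar=1e-9,
    free_antigen_molar=1e-9,
    free_antibody_molar=1e-9,
)
print(f"Ka = {ka:.2e} L/mole")
```

For a polyclonal preparation, a value computed in this way would reflect an average over antibodies of differing affinity rather than a single binding interaction.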

The titer and avidity of polyclonal antibody preparations may be further evaluated to determine
the quality and suitability of such preparations for certain downstream applications. For example, a
polyclonal antibody preparation containing at least 1-2 mg specific antibody/ml, preferably 5-10 mg
specific antibody/ml, is generally employed in procedures requiring precipitation of MDDT-antibody
complexes. Procedures for evaluating antibody specificity, titer, and avidity, and guidelines for
antibody quality and usage in various applications, are generally available. (See, e.g., Catty, supra, and
Coligan et al., supra.)

In another embodiment of the invention, the polynucleotides encoding MDDT, or any fragment
or complement thereof, may be used for therapeutic purposes. In one aspect, modifications of gene
expression can be achieved by designing complementary sequences or antisense molecules (DNA,
RNA, PNA, or modified oligonucleotides) to the coding or regulatory regions of the gene encoding
MDDT. Such technology is well known in the art, and antisense oligonucleotides or larger fragments
can be designed from various locations along the coding or control regions of sequences encoding
MDDT. (See, e.g., Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press Inc., Totowa
NJ.)
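
By way of illustration only, the design of a complementary (antisense) DNA oligonucleotide against a chosen region of a coding sequence can be sketched in Python as follows. The function name, the example sequence, and the choice of region are hypothetical; selecting an effective target region in practice involves considerations not shown here.

```python
# Watson-Crick complement table for unmodified DNA bases.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_dna(sense, start, length):
    """Return the antisense DNA oligonucleotide, written 5'->3', that is
    complementary to sense[start:start + length] (0-based start)."""
    if length <= 0 or start < 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    target = sense.upper()[start:start + length]
    if len(target) != length:
        raise ValueError("requested region extends past the end of the sequence")
    if set(target) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    # Complement each base, then reverse so the result reads 5'->3'.
    return target.translate(_DNA_COMPLEMENT)[::-1]


# Hypothetical sense-strand fragment (illustrative only).
print(antisense_dna("ATGGCC", 0, 6))
```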

In therapeutic use, any gene delivery system suitable for introduction of the antisense
sequences into appropriate target cells can be used. Antisense sequences can be delivered
intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence
complementary to at least a portion of the cellular sequence encoding the target protein. (See, e.g.,
Slater, J.E. et al. (1998) J. Allergy Clin. Immunol. 102(3):469-475; and Scanlon, K.J. et al. (1995)
9(13):1288-1296.) Antisense sequences can also be introduced intracellularly through the use of viral
vectors, such as retrovirus and adeno-associated virus vectors. (See, e.g., Miller, A.D. (1990) Blood
76:271; Ausubel, supra; Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63(3):323-347.) Other
gene delivery mechanisms include liposome-derived systems, artificial viral envelopes, and other
systems known in the art. (See, e.g., Rossi, J.J. (1995) Br. Med. Bull. 51(1):217-225; Boado, R.J. et
al. (1998) J. Pharm. Sci. 87(11):1308-1315; and Morris, M.C. et al. (1997) Nucleic Acids Res.
25(14):2730-2736.)

In another embodiment of the invention, polynucleotides encoding MDDT may be used for
somatic or germline gene therapy. Gene therapy may be performed to (i) correct a genetic deficiency
(e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by X-
linked inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined
immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency
(Blaese, R.M. et al. (1995) Science 270:475-480; Bordignon, C. et al. (1995) Science 270:470-475),
cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-216; Crystal, R.G. et al. (1995) Hum. Gene
Therapy 6:643-666; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:667-703), thalassemias, familial
hypercholesterolemia, and hemophilia resulting from Factor VIII or Factor IX deficiencies (Crystal,
R.G. (1995) Science 270:404-410; Verma, I.M. and N. Somia (1997) Nature 389:239-242)), (ii)
express a conditionally lethal gene product (e.g., in the case of cancers which result from unregulated
cell proliferation), or (iii) express a protein which affords protection against intracellular parasites (e.g.,
against human retroviruses, such as human immunodeficiency virus (HIV) (Baltimore, D. (1988)
Nature 335:395-396; Poeschla, E. et al. (1996) Proc. Natl. Acad. Sci. USA 93:11395-11399), hepatitis
B or C virus (HBV, HCV); fungal parasites, such as Candida albicans and Paracoccidioides
brasiliensis; and protozoan parasites such as Plasmodium falciparum and Trypanosoma cruzi). In the
case where a genetic deficiency in MDDT expression or regulation causes disease, the expression of
MDDT from an appropriate population of transduced cells may alleviate the clinical manifestations
caused by the genetic deficiency.

In a further embodiment of the invention, diseases or disorders caused by deficiencies in
MDDT are treated by constructing mammalian expression vectors encoding MDDT and introducing
these vectors by mechanical means into MDDT-deficient cells. Mechanical transfer technologies for
use with cells in vivo or ex vivo include (i) direct DNA microinjection into individual cells, (ii) ballistic
gold particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene transfer, and
(v) the use of DNA transposons (Morgan, R.A. and W.F. Anderson (1993) Annu. Rev. Biochem.
62:191-217; Ivics, Z. (1997) Cell 91:501-510; Boulay, J-L. and H. Récipon (1998) Curr. Opin.
Biotechnol. 9:445-450).

Expression vectors that may be effective for the expression of MDDT include, but are not
limited to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX, PCR2-TOPOTA vectors
(Invitrogen, Carlsbad CA), PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla CA),
and PTET-OFF, PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (Clontech, Palo Alto CA). MDDT
may be expressed using (i) a constitutively active promoter (e.g., from cytomegalovirus (CMV), Rous
sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible promoter
(e.g., the tetracycline-regulated promoter (Gossen, M. and H. Bujard (1992) Proc. Natl. Acad. Sci.
USA 89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F.M.V. and H.M. Blau
(1998) Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REX plasmid (Invitrogen));
the ecdysone-inducible promoter (available in the plasmids PVGRXR and PIND; Invitrogen); the
FK506/rapamycin inducible promoter; or the RU486/mifepristone inducible promoter (Rossi, F.M.V.
and H.M. Blau, supra)), or (iii) a tissue-specific promoter or the native promoter of the endogenous
gene encoding MDDT from a normal individual.

Commercially available liposome transformation kits (e.g., the PERFECT LIPID
TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver
polynucleotides to target cells in culture and require minimal effort to optimize experimental
parameters. In the alternative, transformation is performed using the calcium phosphate method
(Graham, F.L. and A.J. Eb (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al.
(1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of these
standardized mammalian transfection protocols.

In another embodiment of the invention, diseases or disorders caused by genetic defects with
respect to MDDT expression are treated by constructing a retrovirus vector consisting of (i) the
polynucleotide encoding MDDT under the control of an independent promoter or the retrovirus long
terminal repeat (LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive
element (RRE) along with additional retrovirus cis-acting RNA sequences and coding sequences
required for efficient vector propagation. Retrovirus vectors (e.g., PFB and PFBNEO) are
commercially available (Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc.
Natl. Acad. Sci. USA 92:6733-6737), incorporated by reference herein. The vector is propagated in
an appropriate vector producing cell line (VPCL) that expresses an envelope gene with a tropism for
receptors on the target cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al.
(1987) J. Virol. 61:1647-1650; Bender, M.A. et al. (1987) J. Virol. 61:1639-1646; Adam, M.A. and
A.D. Miller (1988) J. Virol. 62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey, R. et
al. (1998) J. Virol. 72:9873-9880). U.S. Patent No. 5,910,434 to Rigg ("Method for obtaining
retrovirus packaging cell lines producing high transducing efficiency retroviral supernatant") discloses
a method for obtaining retrovirus packaging cell lines and is hereby incorporated by reference.
Propagation of retrovirus vectors, transduction of a population of cells (e.g., CD4+ T-cells), and the
return of transduced cells to a patient are procedures well known to persons skilled in the art of gene
therapy and have been well documented (Ranga, U. et al. (1997) J. Virol. 71:7020-7029; Bauer, G. et
al. (1997) Blood 89:2259-2267; Bonyhadi, M.L. (1997) J. Virol. 71:4707-4716; Ranga, U. et al. (1998)
Proc. Natl. Acad. Sci. USA 95:1201-1206; Su, L. (1997) Blood 89:2283-2290).

In the alternative, an adenovirus-based gene therapy delivery system is used to deliver
polynucleotides encoding MDDT to cells which have one or more genetic abnormalities with respect
to the expression of MDDT. The construction and packaging of adenovirus-based vectors are well
known to those with ordinary skill in the art. Replication defective adenovirus vectors have proven to
be versatile for importing genes encoding immunoregulatory proteins into intact islets in the pancreas
(Csete, M.E. et al. (1995) Transplantation 27:263-268). Potentially useful adenoviral vectors are
described in U.S. Patent No. 5,707,618 to Armentano ("Adenovirus vectors for gene therapy"),
hereby incorporated by reference. For adenoviral vectors, see also Antinozzi, P.A. et al. (1999)
Annu. Rev. Nutr. 19:511-544 and Verma, I.M. and N. Somia (1997) Nature 389:239-242, both
incorporated by reference herein.

In another alternative, a herpes-based gene therapy delivery system is used to deliver
polynucleotides encoding MDDT to target cells which have one or more genetic abnormalities with
respect to the expression of MDDT. The use of herpes simplex virus (HSV)-based vectors may be
especially valuable for introducing MDDT to cells of the central nervous system, for which HSV has a
tropism. The construction and packaging of herpes-based vectors are well known to those with
ordinary skill in the art. A replication-competent herpes simplex virus (HSV) type 1-based vector has
been used to deliver a reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye Res.
169:385-395). The construction of a HSV-1 virus vector has also been disclosed in detail in U.S.
Patent No. 5,804,413 to DeLuca ("Herpes simplex virus strains for gene transfer"), which is hereby
incorporated by reference. U.S. Patent No. 5,804,413 teaches the use of recombinant HSV d92
which consists of a genome containing at least one exogenous gene to be transferred to a cell under
the control of the appropriate promoter for purposes including human gene therapy. Also taught by
this patent are the construction and use of recombinant HSV strains deleted for ICP4, ICP27 and
ICP22. For HSV vectors, see also Goins, W.F. et al. (1999) J. Virol. 73:519-532 and Xu, H. et al.
(1994) Dev. Biol. 163:152-161, hereby incorporated by reference. The manipulation of cloned
herpesvirus sequences, the generation of recombinant virus following the transfection of multiple
plasmids containing different segments of the large herpesvirus genomes, the growth and propagation
of herpesvirus, and the infection of cells with herpesvirus are techniques well known to those of
ordinary skill in the art.

In another alternative, an alphavirus (positive, single-stranded RNA virus) vector is used to
deliver polynucleotides encoding MDDT to target cells. The biology of the prototypic alphavirus,
Semliki Forest Virus (SFV), has been studied extensively and gene transfer vectors have been based
on the SFV genome (Garoff, H. and K.-J. Li (1998) Curr. Opin. Biotechnol. 9:464-469). During
alphavirus RNA replication, a subgenomic RNA is generated that normally encodes the viral capsid
proteins. This subgenomic RNA replicates to higher levels than the full length genomic RNA,
resulting in the overproduction of capsid proteins relative to the viral proteins with enzymatic activity
(e.g., protease and polymerase). Similarly, inserting the coding sequence for MDDT into the
alphavirus genome in place of the capsid-coding region results in the production of a large number of
MDDT-coding RNAs and the synthesis of high levels of MDDT in vector transduced cells. While
alphavirus infection is typically associated with cell lysis within a few days, the ability to establish a
persistent infection in hamster normal kidney cells (BHK-21) with a variant of Sindbis virus (SIN)
indicates that the lytic replication of alphaviruses can be altered to suit the needs of the gene therapy
application (Dryga, S.A. et al. (1997) Virology 228:74-83). The wide host range of alphaviruses will
allow the introduction of MDDT into a variety of cell types. The specific transduction of a subset of
cells in a population may require the sorting of cells prior to transduction. The methods of
manipulating infectious cDNA clones of alphaviruses, performing alphavirus cDNA and RNA
transfections, and performing alphavirus infections, are well known to those with ordinary skill in the
art.

Oligonucleotides derived from the transcription initiation site, e.g., between about positions -10
and +10 from the start site, may also be employed to inhibit gene expression. Similarly, inhibition can
be achieved using triple helix base-pairing methodology. Triple helix pairing is useful because it
causes inhibition of the ability of the double helix to open sufficiently for the binding of polymerases,
transcription factors, or regulatory molecules. Recent therapeutic advances using triplex DNA have
been described in the literature. (See, e.g., Gee, J.E. et al. (1994) in Huber, B.E. and B.I. Carr,
Molecular and Immunologic Approaches, Futura Publishing, Mt. Kisco NY, pp. 163-177.) A
complementary sequence or antisense molecule may also be designed to block translation of mRNA
by preventing the transcript from binding to ribosomes.

Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific cleavage of
RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme
molecule to complementary target RNA, followed by endonucleolytic cleavage. For example,
engineered hammerhead motif ribozyme molecules may specifically and efficiently catalyze
endonucleolytic cleavage of sequences encoding MDDT.

Specific ribozyme cleavage sites within any potential RNA target are initially identified by
scanning the target molecule for ribozyme cleavage sites, including the following sequences: GUA,
GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides,
corresponding to the region of the target gene containing the cleavage site, may be evaluated for
secondary structural features which may render the oligonucleotide inoperable. The suitability of
candidate targets may also be evaluated by testing accessibility to hybridization with complementary
oligonucleotides using ribonuclease protection assays.
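
The initial scanning step described above may be sketched in Python as follows. This is an illustrative sketch only: the function name, the default window length of 17 ribonucleotides, the centering of the window on the triplet, and the conversion of T to U for DNA-style input are assumptions made for the example. Evaluation of secondary structure and of accessibility to hybridization is not shown.

```python
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def find_cleavage_windows(rna, window=17):
    """Scan a target sequence for GUA, GUU, and GUC triplets.

    Returns a list of (position, triplet, window_sequence) tuples, where
    position is the 0-based index of the G and window_sequence is the
    surrounding region of up to `window` ribonucleotides (shorter only
    when the target itself is shorter than `window`).
    """
    if not 15 <= window <= 20:
        raise ValueError("window must be between 15 and 20 ribonucleotides")
    rna = rna.upper().replace("T", "U")  # assumption: accept DNA-style input
    flank = (window - 3) // 2
    hits = []
    for i in range(len(rna) - 2):
        triplet = rna[i:i + 3]
        if triplet in CLEAVAGE_TRIPLETS:
            # Center the window on the triplet, shifting it inward at the ends.
            start = max(0, i - flank)
            end = min(len(rna), start + window)
            start = max(0, end - window)
            hits.append((i, triplet, rna[start:end]))
    return hits


# Hypothetical target fragment (illustrative only).
for position, triplet, region in find_cleavage_windows("AAAAAAAAGUCAAAAAAAAA"):
    print(position, triplet, region)
```

Each returned region is a candidate that would then be evaluated for secondary structural features as described above.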

Complementary ribonucleic acid molecules and ribozymes of the invention may be prepared
by any method known in the art for the synthesis of nucleic acid molecules. These include techniques
for chemically synthesizing oligonucleotides such as solid phase phosphoramidite chemical synthesis.
Alternatively, RNA molecules may be generated by in vitro and in vivo transcription of DNA
sequences encoding MDDT. Such DNA sequences may be incorporated into a wide variety of
vectors with suitable RNA polymerase promoters such as T7 or SP6. Alternatively, these cDNA
constructs that synthesize complementary RNA, constitutively or inducibly, can be introduced into cell
lines, cells, or tissues.

RNA molecules may be modified to increase intracellular stability and half-life. Possible
modifications include, but are not limited to, the addition of flanking sequences at the 5' and/or 3' ends
of the molecule, or the use of phosphorothioate or 2' O-methyl rather than phosphodiester linkages
within the backbone of the molecule. This concept is inherent in the production of PNAs and can be
extended in all of these molecules by the inclusion of nontraditional bases such as inosine, queuosine,
and wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine, cytidine,
guanine, thymine, and uridine which are not as easily recognized by endogenous endonucleases.

An additional embodiment of the invention encompasses a method for screening for a
compound which is effective in altering expression of a polynucleotide encoding MDDT. Compounds
which may be effective in altering expression of a specific polynucleotide may include, but are not
limited to, oligonucleotides, antisense oligonucleotides, triple helix-forming oligonucleotides,
transcription factors and other polypeptide transcriptional regulators, and non-macromolecular
chemical entities which are capable of interacting with specific polynucleotide sequences. Effective
compounds may alter polynucleotide expression by acting as either inhibitors or promoters of
polynucleotide expression. Thus, in the treatment of disorders associated with increased MDDT
expression or activity, a compound which specifically inhibits expression of the polynucleotide
encoding MDDT may be therapeutically useful, and in the treatment of disorders associated with
decreased MDDT expression or activity, a compound which specifically promotes expression of the
polynucleotide encoding MDDT may be therapeutically useful.

At least one, and up to a plurality, of test compounds may be screened for effectiveness in
altering expression of a specific polynucleotide. A test compound may be obtained by any method
commonly known in the art, including chemical modification of a compound known to be effective in
altering polynucleotide expression; selection from an existing, commercially-available or proprietary
library of naturally-occurring or non-natural chemical compounds; rational design of a compound
based on chemical and/or structural properties of the target polynucleotide; and selection from a
library of chemical compounds created combinatorially or randomly. A sample comprising a
polynucleotide encoding MDDT is exposed to at least one test compound thus obtained. The sample
may comprise, for example, an intact or permeabilized cell, or an in vitro cell-free or reconstituted
biochemical system. Alterations in the expression of a polynucleotide encoding MDDT are assayed
by any method commonly known in the art. Typically, the expression of a specific polynucleotide is
detected by hybridization with a probe having a nucleotide sequence complementary to the sequence
of the polynucleotide encoding MDDT. The amount of hybridization may be quantified, thus forming
the basis for a comparison of the expression of the polynucleotide both with and without exposure to
one or more test compounds. Detection of a change in the expression of a polynucleotide exposed to
a test compound indicates that the test compound is effective in altering the expression of the
polynucleotide. A screen for a compound effective in altering expression of a specific polynucleotide
can be carried out, for example, using a Schizosaccharomyces pombe gene expression system (Atkins,
D. et al. (1999) U.S. Patent No. 5,932,435; Arndt, G.M. et al. (2000) Nucleic Acids Res. 28:E15) or a
human cell line such as HeLa cell (Clarke, M.L. et al. (2000) Biochem. Biophys. Res. Commun.
268:8-13). A particular embodiment of the present invention involves screening a combinatorial library
of oligonucleotides (such as deoxyribonucleotides, ribonucleotides, peptide nucleic acids, and modified
oligonucleotides) for antisense activity against a specific polynucleotide sequence (Bruice, T.W. et al.
(1997) U.S. Patent No. 5,686,242; Bruice, T.W. et al. (2000) U.S. Patent No. 6,022,691).
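
The comparison of quantified hybridization with and without exposure to a test compound can be illustrated by a simple ratio. The Python sketch below is illustrative only; the function names, the background subtraction, and the two-fold cutoff are assumptions made for the example and are not specified in this description.

```python
def expression_fold_change(treated_signal, untreated_signal, background=0.0):
    """Ratio of background-corrected hybridization signal with the test
    compound to the signal without it."""
    treated = max(treated_signal - background, 0.0)
    untreated = untreated_signal - background
    if untreated <= 0:
        raise ValueError("untreated signal must exceed background")
    return treated / untreated


def classify_compound(fold_change, threshold=2.0):
    """Label a compound by its effect on expression.

    `threshold` is an assumed, illustrative cutoff: a change of at least
    `threshold`-fold in either direction is treated as an effect.
    """
    if fold_change >= threshold:
        return "promoter"
    if fold_change <= 1.0 / threshold:
        return "inhibitor"
    return "no effect detected"


# Hypothetical signal intensities in arbitrary units (illustrative only).
change = expression_fold_change(treated_signal=250.0, untreated_signal=1050.0, background=50.0)
print(round(change, 3), classify_compound(change))
```

In practice, replicate measurements and an appropriate statistical test would be needed before concluding that a compound alters expression.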

Many methods for introducing vectors into cells or tissues are available and equally suitable
for use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced into stem cells
taken from the patient and clonally propagated for autologous transplant back into that same patient.
Delivery by transfection, by liposome injections, or by polycationic amino polymers may be achieved
using methods which are well known in the art. (See, e.g., Goldman, C.K. et al. (1997) Nat.
Biotechnol. 15:462-466.)

Any of the therapeutic methods described above may be applied to any subject in need of 
such therapy, including, for example, mammals such as humans, dogs, cats, cows, horses, rabbits, and
monkeys.

An additional embodiment of the invention relates to the administration of a composition which
generally comprises an active ingredient formulated with a pharmaceutically acceptable excipient.
Excipients may include, for example, sugars, starches, celluloses, gums, and proteins. Various
formulations are commonly known and are thoroughly discussed in the latest edition of Remington's
Pharmaceutical Sciences (Mack Publishing, Easton PA). Such compositions may consist of MDDT,
antibodies to MDDT, and mimetics, agonists, antagonists, or inhibitors of MDDT.

The compositions utilized in this invention may be administered by any number of routes
including, but not limited to, oral, intravenous, intramuscular, intra-arterial, intramedullary, intrathecal,
intraventricular, pulmonary, transdermal, subcutaneous, intraperitoneal, intranasal, enteral, topical,
sublingual, or rectal means.

Compositions for pulmonary administration may be prepared in liquid or dry powder form.
These compositions are generally aerosolized immediately prior to inhalation by the patient. In the
case of small molecules (e.g., traditional low molecular weight organic drugs), aerosol delivery of fast-
acting formulations is well-known in the art. In the case of macromolecules (e.g., larger peptides and
proteins), recent developments in the field of pulmonary delivery via the alveolar region of the lung
have enabled the practical delivery of drugs such as insulin to blood circulation (see, e.g., Patton, J.S.
et al., U.S. Patent No. 5,997,848). Pulmonary delivery has the advantage of administration without
needle injection, and obviates the need for potentially toxic penetration enhancers.

Compositions suitable for use in the invention include compositions wherein the active
ingredients are contained in an effective amount to achieve the intended purpose. The determination
of an effective dose is well within the capability of those skilled in the art.

Specialized forms of compositions may be prepared for direct intracellular delivery of
macromolecules comprising MDDT or fragments thereof. For example, liposome preparations
containing a cell-impermeable macromolecule may promote cell fusion and intracellular delivery of the
macromolecule. Alternatively, MDDT or a fragment thereof may be joined to a short cationic N-
terminal portion from the HIV Tat-1 protein. Fusion proteins thus generated have been found to
transduce into the cells of all tissues, including the brain, in a mouse model system (Schwarze, S.R. et
al. (1999) Science 285:1569-1572).

For any compound, the therapeutically effective dose can be estimated initially either in cell
culture assays, e.g., of neoplastic cells, or in animal models such as mice, rats, rabbits, dogs, monkeys,
or pigs. An animal model may also be used to determine the appropriate concentration range and
route of administration. Such information can then be used to determine useful doses and routes for
administration in humans.

A therapeutically effective dose refers to that amount of active ingredient, for example
MDDT or fragments thereof, antibodies of MDDT, and agonists, antagonists or inhibitors of MDDT,
which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be determined
by standard pharmaceutical procedures in cell cultures or with experimental animals, such as by
calculating the ED50 (the dose therapeutically effective in 50% of the population) or LD50 (the dose
lethal to 50% of the population) statistics. The dose ratio of toxic to therapeutic effects is the
therapeutic index, which can be expressed as the LD50/ED50 ratio. Compositions which exhibit large
therapeutic indices are preferred. The data obtained from cell culture assays and animal studies are
used to formulate a range of dosage for human use. The dosage contained in such compositions is
preferably within a range of circulating concentrations that includes the ED50 with little or no toxicity.
The dosage varies within this range depending upon the dosage form employed, the sensitivity of the
patient, and the route of administration.
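
The LD50/ED50 ratio described above can be illustrated with a short Python sketch. The function names and the example values are hypothetical. The log-linear interpolation shown is a simplified stand-in chosen for the example and is not one of the standard pharmaceutical procedures referred to above, which typically fit a full dose-response curve.

```python
import math


def interpolate_dose_50(doses, fractions):
    """Estimate the dose at which 50% of a population responds.

    `doses` must be positive and ascending; `fractions` are the observed
    response fractions (0 to 1) at each dose. Interpolation is linear in
    log(dose) between the two points that bracket 50%.
    """
    pairs = list(zip(doses, fractions))
    for (d_lo, f_lo), (d_hi, f_hi) in zip(pairs, pairs[1:]):
        if f_lo <= 0.5 <= f_hi and f_hi > f_lo:
            t = (0.5 - f_lo) / (f_hi - f_lo)
            return math.exp(math.log(d_lo) + t * (math.log(d_hi) - math.log(d_lo)))
    raise ValueError("50% response is not bracketed by the data")


def therapeutic_index(ld50, ed50):
    """Therapeutic index expressed as the LD50/ED50 ratio."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical dose-response data in arbitrary dose units (illustrative only).
ed50 = interpolate_dose_50([1, 10, 100], [0.1, 0.5, 0.9])
ld50 = interpolate_dose_50([100, 1000, 10000], [0.0, 0.5, 1.0])
print(therapeutic_index(ld50, ed50))
```

A larger ratio corresponds to a wider margin between the effective dose and the lethal dose, consistent with the stated preference for large therapeutic indices.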

The exact dosage will be determined by the practitioner, in light of factors related to the
subject requiring treatment. Dosage and administration are adjusted to provide sufficient levels of the
active moiety or to maintain the desired effect. Factors which may be taken into account include the
severity of the disease state, the general health of the subject, the age, weight, and gender of the
subject, time and frequency of administration, drug combination(s), reaction sensitivities, and response
to therapy. Long-acting compositions may be administered every 3 to 4 days, every week, or
biweekly depending on the half-life and clearance rate of the particular formulation.

Normal dosage amounts may vary from about 0.1 μg to 100,000 μg, up to a total dose of
about 1 gram, depending upon the route of administration. Guidance as to particular dosages and
methods of delivery is provided in the literature and generally available to practitioners in the art.
Those skilled in the art will employ different formulations for nucleotides than for proteins or their
inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells,
conditions, locations, etc.
DIAGNOSTICS 

In another embodiment, antibodies which specifically bind MDDT may be used for the
diagnosis of disorders characterized by expression of MDDT, or in assays to monitor patients being
treated with MDDT or agonists, antagonists, or inhibitors of MDDT. Antibodies useful for diagnostic
purposes may be prepared in the same manner as described above for therapeutics. Diagnostic
assays for MDDT include methods which utilize the antibody and a label to detect MDDT in human
body fluids or in extracts of cells or tissues. The antibodies may be used with or without modification,
and may be labeled by covalent or non-covalent attachment of a reporter molecule. A wide variety of
reporter molecules, several of which are described above, are known in the art and may be used.

A variety of protocols for measuring MDDT, including ELISAs, RIAs, and FACS, are known
in the art and provide a basis for diagnosing altered or abnormal levels of MDDT expression. Normal
or standard values for MDDT expression are established by combining body fluids or cell extracts
taken from normal mammalian subjects, for example, human subjects, with antibodies to MDDT under
conditions suitable for complex formation. The amount of standard complex formation may be
quantitated by various methods, such as photometric means. Quantities of MDDT expressed in
subject, control, and disease samples from biopsied tissues are compared with the standard values.
Deviation between standard and subject values establishes the parameters for diagnosing disease.
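
The comparison of a subject value with standard values may be sketched, for illustration only, as follows. The specification does not prescribe a particular statistic; the use of a z-score, the cutoff of two standard deviations, and all names and values below are assumptions made for this example.

```python
from statistics import mean, stdev


def deviation_from_standard(normal_values, subject_value):
    """Return how many sample standard deviations the subject lies from the normal mean."""
    if len(normal_values) < 2:
        raise ValueError("at least two normal values are needed")
    sigma = stdev(normal_values)
    if sigma == 0:
        raise ValueError("normal values show no variation")
    return (subject_value - mean(normal_values)) / sigma


def is_altered(normal_values, subject_value, cutoff=2.0):
    """Flag a subject value that deviates from the standard by more than the cutoff."""
    return abs(deviation_from_standard(normal_values, subject_value)) > cutoff


# Hypothetical photometric readings from normal subjects (arbitrary units).
normal_readings = [10.0, 12.0, 11.0, 9.0, 13.0]
print(is_altered(normal_readings, 20.0))
print(is_altered(normal_readings, 11.5))
```

In practice the cutoff would be chosen from validation data for the particular assay rather than fixed in advance.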

In another embodiment of the invention, the polynucleotides encoding MDDT may be used for
diagnostic purposes. The polynucleotides which may be used include oligonucleotide sequences,
complementary RNA and DNA molecules, and PNAs. The polynucleotides may be used to detect
and quantify gene expression in biopsied tissues in which expression of MDDT may be correlated with
disease. The diagnostic assay may be used to determine absence, presence, and excess expression of
MDDT, and to monitor regulation of MDDT levels during therapeutic intervention.

In one aspect, hybridization with PCR probes which are capable of detecting polynucleotide
sequences, including genomic sequences, encoding MDDT or closely related molecules may be used
to identify nucleic acid sequences which encode MDDT. The specificity of the probe, whether it is
made from a highly specific region, e.g., the 5' regulatory region, or from a less specific region, e.g., a
conserved motif, and the stringency of the hybridization or amplification will determine whether the
probe identifies only naturally occurring sequences encoding MDDT, allelic variants, or related
sequences.

Probes may also be used for the detection of related sequences, and may have at least 50% 
sequence identity to any of the MDDT encoding sequences. The hybridization probes of the subject
invention may be DNA or RNA and may be derived from the sequence of SEQ ID NO:24-46 or from 
genomic sequences including promoters, enhancers, and introns of the MDDT gene. 
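
For illustration only, percent sequence identity between a probe and a target may be computed from a pairwise alignment as sketched below. Several definitions of percent identity are in use; this sketch assumes that the two sequences have already been aligned to equal length with "-" marking gaps, and that a gap opposite a residue counts as a mismatch. The function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns, skipping columns where both rows are gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical aligned probe and target sequences.
identity = percent_identity("ACGTACGT", "ACGTTCGA")
print(identity, identity >= 50.0)
```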

Means for producing specific hybridization probes for DNAs encoding MDDT include the
cloning of polynucleotide sequences encoding MDDT or MDDT derivatives into vectors for the
production of mRNA probes. Such vectors are known in the art, are commercially available, and may
be used to synthesize RNA probes in vitro by means of the addition of the appropriate RNA
polymerases and the appropriate labeled nucleotides. Hybridization probes may be labeled by a
variety of reporter groups, for example, by radionuclides such as 32P or 35S, or by enzymatic labels,
such as alkaline phosphatase coupled to the probe via avidin/biotin coupling systems, and the like.

Polynucleotide sequences encoding MDDT may be used for the diagnosis of disorders
associated with expression of MDDT. Examples of such disorders include, but are not limited to, a
cell proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis,
hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal
hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers including
adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in
particular, a cancer of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall
bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid,
penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus; an
autoimmune/inflammatory disorder such as inflammation, actinic keratosis, acquired immunodeficiency
syndrome (AIDS), Addison's disease, adult respiratory distress syndrome, allergies, ankylosing
spondylitis, amyloidosis, anemia, arteriosclerosis, asthma, atherosclerosis, autoimmune hemolytic
anemia, autoimmune thyroiditis, bronchitis, bursitis, cholecystitis, cirrhosis, contact dermatitis, Crohn's
disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, erythroblastosis fetalis,
erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves'
disease, Hashimoto's thyroiditis, paroxysmal nocturnal hemoglobinuria, hepatitis, hypereosinophilia,
irritable bowel syndrome, episodic lymphopenia with lymphocytotoxins, mixed connective tissue
disease (MCTD), multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation,
myelofibrosis, osteoarthritis, osteoporosis, pancreatitis, polycythemia vera, polymyositis, psoriasis,
Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjögren's syndrome, systemic anaphylaxis,
systemic lupus erythematosus, systemic sclerosis, primary thrombocythemia, thrombocytopenic
purpura, ulcerative colitis, uveitis, Werner syndrome, autoimmune
polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), episodic lymphopenia with
lymphocytotoxins, complications of
cancer, hemodialysis, and extracorporeal circulation, trauma, and hematopoietic cancer including
lymphoma, leukemia, and myeloma; a developmental disorder such as renal tubular acidosis, anemia,
Cushing's syndrome, achondroplastic dwarfism, Duchenne and Becker muscular dystrophy, epilepsy,
gonadal dysgenesis, WAGR syndrome (Wilms' tumor, aniridia, genitourinary abnormalities, and mental
retardation), Smith-Magenis syndrome, myelodysplastic syndrome, hereditary mucoepithelial dysplasia,
hereditary keratodermas, hereditary neuropathies such as Charcot-Marie-Tooth disease and
neurofibromatosis, hypothyroidism, hydrocephalus, seizure disorders such as Sydenham's chorea and
cerebral palsy, spina bifida, anencephaly, craniorachischisis, congenital glaucoma, cataract, and
sensorineural hearing loss; a neurological disorder such as epilepsy, ischemic cerebrovascular disease,
stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease, Huntington's disease, dementia,
Parkinson's disease and other extrapyramidal disorders, amyotrophic lateral sclerosis and other motor
neuron disorders, progressive neural muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple
sclerosis and other demyelinating diseases, bacterial and viral meningitis, brain abscess, subdural
empyema, epidural abscess, suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral
central nervous system disease, prion diseases including kuru, Creutzfeldt-Jakob disease, and
Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and metabolic diseases
of the nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal hemangioblastomatosis,
encephalotrigeminal syndrome, mental retardation and other developmental disorder of the central
nervous system, cerebral palsy, a neuroskeletal disorder, an autonomic nervous system disorder, a
cranial nerve disorder, a spinal cord disease, muscular dystrophy and other neuromuscular disorder, a
peripheral nervous system disorder, dermatomyositis and polymyositis, inherited, metabolic, endocrine,
and toxic myopathy, myasthenia gravis, periodic paralysis, a mental disorder including mood, anxiety,
and schizophrenic disorder, seasonal affective disorder (SAD), akathisia, amnesia, catatonia, diabetic
neuropathy, tardive dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia, and Tourette's
disorder; and an infection, such as those caused by a viral agent classified as adenovirus (acute
respiratory disease, pneumonia), arenavirus (lymphocytic choriomeningitis), bunyavirus (Hantavirus),
calicivirus, coronavirus (pneumonia, chronic bronchitis), filovirus, hepadnavirus (hepatitis), herpesvirus
(herpes simplex virus, varicella-zoster virus, Epstein-Barr virus, cytomegalovirus), flavivirus (yellow
fever), orthomyxovirus (influenza), parvovirus, papovavirus or papillomavirus (cancer),
paramyxovirus (measles, mumps), picornavirus (rhinovirus, poliovirus, coxsackie-virus), polyomavirus
(BK virus, JC virus), poxvirus (smallpox), reovirus (Colorado tick fever), retrovirus (human
immunodeficiency virus, human T lymphotropic virus), rhabdovirus (rabies), rotavirus
(gastroenteritis), and togavirus (encephalitis, rubella); an infection caused by a bacterial agent
classified as pneumococcus, staphylococcus, streptococcus, bacillus, corynebacterium, Clostridium,
meningococcus, gonococcus, listeria, moraxella, kingella, haemophilus, legionella, bordetella,
gram-negative enterobacterium including shigella, salmonella, or Campylobacter, pseudomonas, vibrio,
brucella, francisella, yersinia, bartonella, nocardia, actinomyces, mycobacterium, spirochaetale,
rickettsia, chlamydia, or mycoplasma; an infection caused by a fungal agent classified as aspergillus,
blastomyces, dermatophytes, cryptococcus, coccidioides, malassezia, histoplasma, or other
mycosis-causing fungal agent; and an infection caused by a parasite classified as Plasmodium or
malaria-causing, parasitic entamoeba, leishmania, trypanosoma, toxoplasma, Pneumocystis carinii, intestinal
protozoa such as giardia, trichomonas, tissue nematode such as trichinella, intestinal nematode such as
ascaris, lymphatic filarial nematode, trematode such as schistosoma, and cestode such as tapeworm.
The polynucleotide sequences encoding MDDT may be used in Southern or northern analysis, dot blot,
or other membrane-based technologies; in PCR technologies; in dipstick, pin, and multiformat
ELISA-like assays; and in microarrays utilizing fluids or tissues from patients to detect altered MDDT
expression. Such qualitative or quantitative methods are well known in the art.

In a particular aspect, the nucleotide sequences encoding MDDT may be useful in assays that
detect the presence of associated disorders, particularly those mentioned above. The nucleotide
sequences encoding MDDT may be labeled by standard methods and added to a fluid or tissue sample
from a patient under conditions suitable for the formation of hybridization complexes. After a suitable
incubation period, the sample is washed and the signal is quantified and compared with a standard
value. If the amount of signal in the patient sample is significantly altered in comparison to a control
sample then the presence of altered levels of nucleotide sequences encoding MDDT in the sample
indicates the presence of the associated disorder. Such assays may also be used to evaluate the
efficacy of a particular therapeutic treatment regimen in animal studies, in clinical trials, or to monitor
the treatment of an individual patient.

In order to provide a basis for the diagnosis of a disorder associated with expression of
MDDT, a normal or standard profile for expression is established. This may be accomplished by
combining body fluids or cell extracts taken from normal subjects, either animal or human, with a
sequence, or a fragment thereof, encoding MDDT, under conditions suitable for hybridization or
amplification. Standard hybridization may be quantified by comparing the values obtained from normal
subjects with values from an experiment in which a known amount of a substantially purified
polynucleotide is used. Standard values obtained in this manner may be compared with values
obtained from samples from patients who are symptomatic for a disorder. Deviation from standard
values is used to establish the presence of a disorder.

Once the presence of a disorder is established and a treatment protocol is initiated,
hybridization assays may be repeated on a regular basis to determine if the level of expression in the
patient begins to approximate that which is observed in the normal subject. The results obtained from
successive assays may be used to show the efficacy of treatment over a period ranging from several
days to months.

With respect to cancer, the presence of an abnormal amount of transcript (either under- or
overexpressed) in biopsied tissue from an individual may indicate a predisposition for the development
of the disease, or may provide a means for detecting the disease prior to the appearance of actual
clinical symptoms. A more definitive diagnosis of this type may allow health professionals to employ
preventative measures or aggressive treatment earlier, thereby preventing the development or further
progression of the cancer.

Additional diagnostic uses for oligonucleotides designed from the sequences encoding MDDT
may involve the use of PCR. These oligomers may be chemically synthesized, generated
enzymatically, or produced in vitro. Oligomers will preferably contain a fragment of a polynucleotide
encoding MDDT, or a fragment of a polynucleotide complementary to the polynucleotide encoding
MDDT, and will be employed under optimized conditions for identification of a specific gene or
condition. Oligomers may also be employed under less stringent conditions for detection or
quantification of closely related DNA or RNA sequences.

In a particular aspect, oligonucleotide primers derived from the polynucleotide sequences
encoding MDDT may be used to detect single nucleotide polymorphisms (SNPs). SNPs are
substitutions, insertions and deletions that are a frequent cause of inherited or acquired genetic disease
in humans. Methods of SNP detection include, but are not limited to, single-stranded conformation
polymorphism (SSCP) and fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers
derived from the polynucleotide sequences encoding MDDT are used to amplify DNA using the
polymerase chain reaction (PCR). The DNA may be derived, for example, from diseased or normal
tissue, biopsy samples, bodily fluids, and the like. SNPs in the DNA cause differences in the
secondary and tertiary structures of PCR products in single-stranded form, and these differences are
detectable using gel electrophoresis in non-denaturing gels. In fSSCP, the oligonucleotide primers are
fluorescently labeled, which allows detection of the amplimers in high-throughput equipment such as
DNA sequencing machines. Additionally, sequence database analysis methods, termed in silico SNP
(isSNP), are capable of identifying polymorphisms by comparing the sequence of individual
overlapping DNA fragments which assemble into a common consensus sequence. These
computer-based methods filter out sequence variations due to laboratory preparation of DNA and sequencing
errors using statistical models and automated analyses of DNA sequence chromatograms. In the
alternative, SNPs may be detected and characterized by mass spectrometry using, for example, the
high throughput MASSARRAY system (Sequenom, Inc., San Diego CA).
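
The in silico comparison of overlapping fragments may be sketched, for illustration only, as follows. This is not the statistical model referred to above: as a crude stand-in for error filtering, the sketch assumes that a variant base is reported only when at least a minimum number of fragments support it. It also assumes that the fragments have already been aligned to equal length, with "-" marking positions a fragment does not cover. All names and sequences are hypothetical.

```python
from collections import Counter


def candidate_snps(aligned_fragments, min_minor_count=2):
    """Return (position, major_base, minor_base) for columns with a supported variant."""
    if not aligned_fragments:
        return []
    length = len(aligned_fragments[0])
    if any(len(fragment) != length for fragment in aligned_fragments):
        raise ValueError("fragments must be aligned to equal length")
    snps = []
    for pos in range(length):
        # Count the bases observed in this column, ignoring uncovered positions.
        counts = Counter(f[pos] for f in aligned_fragments if f[pos] != "-")
        if len(counts) < 2:
            continue  # every covering fragment agrees
        (major, _), (minor, minor_count) = counts.most_common(2)
        if minor_count >= min_minor_count:
            snps.append((pos, major, minor))
    return snps


# Hypothetical aligned fragments; the single "T" at the last position is treated as an error.
fragments = ["ACGTAC", "ACGTAC", "ACATAC", "ACATAC", "ACGTAT"]
print(candidate_snps(fragments))
```

Raising `min_minor_count` makes the filter stricter at the cost of missing rare variants.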

SNPs may be used to study the genetic basis of human disease. For example, at least 16
common SNPs have been associated with non-insulin-dependent diabetes mellitus. SNPs are also
useful for examining differences in disease outcomes in monogenic disorders, such as cystic fibrosis,
sickle cell anemia, or chronic granulomatous disease. For example, variants in the mannose-binding
lectin, MBL2, have been shown to be correlated with deleterious pulmonary outcomes in cystic
fibrosis. SNPs also have utility in pharmacogenomics, the identification of genetic variants that
influence a patient's response to a drug, such as life-threatening toxicity. For example, a variation in
N-acetyl transferase is associated with a high incidence of peripheral neuropathy in response to the
anti-tuberculosis drug isoniazid, while a variation in the core promoter of the ALOX5 gene results in
diminished clinical response to treatment with an anti-asthma drug that targets the 5-lipoxygenase
pathway. Analysis of the distribution of SNPs in different populations is useful for investigating
genetic drift, mutation, recombination, and selection, as well as for tracing the origins of populations
and their migrations. (Taylor, J.G. et al. (2001) Trends Mol. Med. 7:507-512; Kwok, P.-Y. and Z. Gu
(1999) Mol. Med. Today 5:538-543; Nowotny, P. et al. (2001) Curr. Opin. Neurobiol. 11:637-641.)

Methods which may also be used to quantify the expression of MDDT include radiolabeling or
biotinylating nucleotides, coamplification of a control nucleic acid, and interpolating results from
standard curves. (See, e.g., Melby, P.C. et al. (1993) J. Immunol. Methods 159:235-244; Duplaa, C.
et al. (1993) Anal. Biochem. 212:229-236.) The speed of quantitation of multiple samples may be
accelerated by running the assay in a high-throughput format where the oligomer or polynucleotide of
interest is presented in various dilutions and a spectrophotometric or colorimetric response gives rapid
quantitation.
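
Interpolation of a result from a standard curve may be sketched, for illustration only, as follows. The sketch assumes a monotonically increasing curve and piecewise-linear interpolation between adjacent standards; actual assays may call for a fitted model instead. The function name and the dilution values are hypothetical.

```python
def interpolate_from_standard_curve(standards, signal):
    """Estimate an amount from a signal, given (known_amount, measured_signal) standards."""
    points = sorted(standards, key=lambda point: point[1])
    if len(points) < 2:
        raise ValueError("at least two standards are needed")
    if not points[0][1] <= signal <= points[-1][1]:
        raise ValueError("signal lies outside the range of the standard curve")
    for (amount_lo, signal_lo), (amount_hi, signal_hi) in zip(points, points[1:]):
        if signal_lo <= signal <= signal_hi:
            if signal_hi == signal_lo:
                return (amount_lo + amount_hi) / 2
            fraction = (signal - signal_lo) / (signal_hi - signal_lo)
            return amount_lo + fraction * (amount_hi - amount_lo)
    raise ValueError("signal could not be placed on the curve")


# Hypothetical dilution series: (amount in ng, absorbance).
curve = [(0.0, 0.05), (1.0, 0.25), (2.0, 0.45), (4.0, 0.85)]
print(interpolate_from_standard_curve(curve, 0.35))
```

Signals outside the range of the standards are rejected rather than extrapolated, since extrapolation beyond the measured dilutions is generally unreliable.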

In further embodiments, oligonucleotides or longer fragments derived from any of the
polynucleotide sequences described herein may be used as elements on a microarray. The microarray
can be used in transcript imaging techniques which monitor the relative expression levels of large
numbers of genes simultaneously as described below. The microarray may also be used to identify
genetic variants, mutations, and polymorphisms. This information may be used to determine gene
function, to understand the genetic basis of a disorder, to diagnose a disorder, to monitor
progression/regression of disease as a function of gene expression, and to develop and monitor the
activities of therapeutic agents in the treatment of disease. In particular, this information may be used
to develop a pharmacogenomic profile of a patient in order to select the most appropriate and effective
treatment regimen for that patient. For example, therapeutic agents which are highly effective and
display the fewest side effects may be selected for a patient based on his/her pharmacogenomic
profile.

In another embodiment, MDDT, fragments of MDDT, or antibodies specific for MDDT may
be used as elements on a microarray. The microarray may be used to monitor or measure
protein-protein interactions, drug-target interactions, and gene expression profiles, as described above.

A particular embodiment relates to the use of the polynucleotides of the present invention to
generate a transcript image of a tissue or cell type. A transcript image represents the global pattern of
gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed by
quantifying the number of expressed genes and their relative abundance under given conditions and at
a given time. (See Seilhamer et al., "Comparative Gene Transcript Analysis," U.S. Patent No.
5,840,484, expressly incorporated by reference herein.) Thus a transcript image may be generated by
hybridizing the polynucleotides of the present invention or their complements to the totality of
transcripts or reverse transcripts of a particular tissue or cell type. In one embodiment, the
hybridization takes place in high-throughput format, wherein the polynucleotides of the present
invention or their complements comprise a subset of a plurality of elements on a microarray. The
resultant transcript image would provide a profile of gene activity.
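
For illustration only, the quantification underlying a transcript image (the number of expressed genes and their relative abundance) may be sketched as follows. The sketch assumes that per-gene transcript counts or signal intensities are already available; the function name, gene labels, and counts are hypothetical.

```python
def transcript_image(counts):
    """Return (number of expressed genes, relative abundance of each expressed gene)."""
    expressed = {gene: count for gene, count in counts.items() if count > 0}
    total = sum(expressed.values())
    if total == 0:
        return 0, {}
    abundance = {gene: count / total for gene, count in expressed.items()}
    return len(expressed), abundance


# Hypothetical counts for four array elements in one tissue sample.
n_expressed, abundance = transcript_image(
    {"geneA": 30, "geneB": 10, "geneC": 0, "geneD": 60}
)
print(n_expressed, abundance)
```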

Transcript images may be generated using transcripts isolated from tissues, cell lines, biopsies,
or other biological samples. The transcript image may thus reflect gene expression in vivo, as in the
case of a tissue or biopsy sample, or in vitro, as in the case of a cell line.

Transcript images which profile the expression of the polynucleotides of the present invention
may also be used in conjunction with in vitro model systems and preclinical evaluation of
pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental
compounds. All compounds induce characteristic gene expression patterns, frequently termed
molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and toxicity
(Nuwaysir, E.F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and N.L. Anderson (2000)
Toxicol. Lett. 112-113:467-471, expressly incorporated by reference herein). If a test compound has a
signature similar to that of a compound with known toxicity, it is likely to share those toxic properties.
These fingerprints or signatures are most useful and refined when they contain expression information
from a large number of genes and gene families. Ideally, a genome-wide measurement of expression
provides the highest quality signature. Even genes whose expression is not altered by any tested
compounds are important as well, as the levels of expression of these genes are used to normalize the
rest of the expression data. The normalization procedure is useful for comparison of expression data
after treatment with different compounds. While the assignment of gene function to elements of a
toxicant signature aids in interpretation of toxicity mechanisms, knowledge of gene function is not
necessary for the statistical matching of signatures which leads to prediction of toxicity. (See, for
example, Press Release 00-02 from the National Institute of Environmental Health Sciences, released
February 29, 2000, available at http://www.niehs.nih.gov/oc/news/toxchip.htm.) Therefore, it is
important and desirable in toxicological screening using toxicant signatures to include all expressed
gene sequences.
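
The normalization and statistical matching of signatures described above may be sketched, for illustration only, as follows. The specification does not name a matching statistic; the sketch assumes normalization by the mean level of genes whose expression is unaltered, followed by a Pearson correlation over the remaining genes. All names and expression values are hypothetical.

```python
from math import sqrt


def normalize(profile, reference_genes):
    """Scale expression values by the mean level of unaltered reference genes."""
    scale = sum(profile[gene] for gene in reference_genes) / len(reference_genes)
    if scale <= 0:
        raise ValueError("reference genes must have positive mean expression")
    return {gene: value / scale for gene, value in profile.items()}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need two equal-length vectors with at least two values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a signature with no variation cannot be correlated")
    return cov / sqrt(var_x * var_y)


def signature_similarity(test, known, reference_genes):
    """Correlate two normalized signatures over their shared non-reference genes."""
    t = normalize(test, reference_genes)
    k = normalize(known, reference_genes)
    genes = sorted((set(t) & set(k)) - set(reference_genes))
    return pearson([t[g] for g in genes], [k[g] for g in genes])


# Hypothetical signatures; "ref1" and "ref2" stand for genes unaltered by any compound.
known_toxicant = {"ref1": 100, "ref2": 100, "g1": 200, "g2": 50, "g3": 400}
test_compound = {"ref1": 50, "ref2": 50, "g1": 100, "g2": 25, "g3": 200}
print(signature_similarity(test_compound, known_toxicant, ["ref1", "ref2"]))
```

In this example the test compound was measured at half the overall signal of the known toxicant, and normalization against the reference genes removes that difference before the signatures are compared.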

In one embodiment, the toxicity of a test compound is assessed by treating a biological sample
containing nucleic acids with the test compound. Nucleic acids that are expressed in the treated
biological sample are hybridized with one or more probes specific to the polynucleotides of the present
invention, so that transcript levels corresponding to the polynucleotides of the present invention may be
quantified. The transcript levels in the treated biological sample are compared with levels in an
untreated biological sample. Differences in the transcript levels between the two samples are
indicative of a toxic response caused by the test compound in the treated sample.
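
The comparison of treated and untreated transcript levels may be sketched, for illustration only, as follows. The use of a log2 ratio, the pseudocount, the two-fold threshold, and all names and values are assumptions made for this example and are not prescribed by the specification.

```python
from math import log2


def changed_transcripts(treated, untreated, min_abs_log2=1.0, pseudocount=1.0):
    """Return the log2 ratio (treated/untreated) for transcripts that change enough."""
    changes = {}
    for gene in sorted(treated.keys() & untreated.keys()):
        # The pseudocount avoids division by zero for undetected transcripts.
        ratio = log2((treated[gene] + pseudocount) / (untreated[gene] + pseudocount))
        if abs(ratio) >= min_abs_log2:
            changes[gene] = ratio
    return changes


# Hypothetical transcript levels quantified by hybridization (arbitrary units).
treated_sample = {"a": 7, "b": 10, "c": 1}
untreated_sample = {"a": 1, "b": 10, "c": 7}
print(changed_transcripts(treated_sample, untreated_sample))
```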

Another particular embodiment relates to the use of the polypeptide sequences of the present
invention to analyze the proteome of a tissue or cell type. The term proteome refers to the global
pattern of protein expression in a particular tissue or cell type. Each protein component of a proteome
can be subjected individually to further analysis. Proteome expression patterns, or profiles, are
analyzed by quantifying the number of expressed proteins and their relative abundance under given
conditions and at a given time. A profile of a cell's proteome may thus be generated by separating
and analyzing the polypeptides of a particular tissue or cell type. In one embodiment, the separation is
achieved using two-dimensional gel electrophoresis, in which proteins from a sample are separated by
isoelectric focusing in the first dimension, and then according to molecular weight by sodium dodecyl
sulfate slab gel electrophoresis in the second dimension (Steiner and Anderson, supra). The proteins
are visualized in the gel as discrete and uniquely positioned spots, typically by staining the gel with an
agent such as Coomassie Blue or silver or fluorescent stains. The optical density of each protein spot
is generally proportional to the level of the protein in the sample. The optical densities of equivalently
positioned protein spots from different samples, for example, from biological samples either treated or
untreated with a test compound or therapeutic agent, are compared to identify any changes in protein
spot density related to the treatment. The proteins in the spots are partially sequenced using, for
example, standard methods employing chemical or enzymatic cleavage followed by mass
spectrometry. The identity of the protein in a spot may be determined by comparing its partial
sequence, preferably of at least 5 contiguous amino acid residues, to the polypeptide sequences of the
present invention. In some cases, further sequence data may be obtained for definitive protein
identification.
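
The comparison of a partial sequence with a set of polypeptide sequences may be sketched, for illustration only, as an exact substring search. The sketch assumes one-letter amino acid codes and requires at least 5 contiguous residues, as stated above; the function name, labels, and sequences are hypothetical and are not sequences of the present invention.

```python
def identify_protein(partial_sequence, polypeptides, min_length=5):
    """Return the names of polypeptides that contain the partial sequence contiguously."""
    fragment = partial_sequence.upper()
    if len(fragment) < min_length:
        raise ValueError(f"partial sequence must have at least {min_length} residues")
    return [name for name, sequence in polypeptides.items() if fragment in sequence.upper()]


# Hypothetical polypeptide sequences in one-letter code.
candidates = {
    "P1": "MKTAYIAKQRQISFVK",
    "P2": "MSDNELLQKAYIG",
}
print(identify_protein("AYIAK", candidates))
```

A fragment that matches more than one candidate is ambiguous, which corresponds to the case in which further sequence data are obtained for definitive identification.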

A proteomic profile may also be generated using antibodies specific for MDDT to quantify the
levels of MDDT expression. In one embodiment, the antibodies are used as elements on a
microarray, and protein expression levels are quantified by exposing the microarray to the sample and
detecting the levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem.
270:103-111; Mendoza, L.G. et al. (1999) Biotechniques 27:778-788). Detection may be performed by
a variety of methods known in the art, for example, by reacting the proteins in the sample with a thiol-
or amino-reactive fluorescent compound and detecting the amount of fluorescence bound at each
array element.

Toxicant signatures at the proteome level are also useful for toxicological screening, and
should be analyzed in parallel with toxicant signatures at the transcript level. There is a poor
correlation between transcript and protein abundances for some proteins in some tissues (Anderson,
N.L. and J. Seilhamer (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be
useful in the analysis of compounds which do not significantly affect the transcript image, but which
alter the proteomic profile. In addition, the analysis of transcripts in body fluids is difficult, due to rapid
degradation of mRNA, so proteomic profiling may be more reliable and informative in such cases.

In another embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing proteins with the test compound. Proteins that are expressed in the treated
biological sample are separated so that the amount of each protein can be quantified. The amount of
each protein is compared to the amount of the corresponding protein in an untreated biological sample.
A difference in the amount of protein between the two samples is indicative of a toxic response to the
test compound in the treated sample. Individual proteins are identified by sequencing the amino acid
residues of the individual proteins and comparing these partial sequences to the polypeptides of the
present invention.

In another embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing proteins with the test compound. Proteins from the biological sample are incubated
with antibodies specific to the polypeptides of the present invention. The amount of protein recognized
by the antibodies is quantified. The amount of protein in the treated biological sample is compared
with the amount in an untreated biological sample. A difference in the amount of protein between the
two samples is indicative of a toxic response to the test compound in the treated sample.

Microarrays may be prepared, used, and analyzed using methods known in the art. (See, e.g.,
Brennan, T.M. et al. (1995) U.S. Patent No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad.
Sci. USA 93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/251116; Shalon, D. et
al. (1995) PCT application WO95/35505; Heller, R.A. et al. (1997) Proc. Natl. Acad. Sci. USA
94:2150-2155; and Heller, M.J. et al. (1997) U.S. Patent No. 5,605,662.) Various types of
microarrays are well known and thoroughly described in DNA Microarrays: A Practical Approach,
M. Schena, ed. (1999) Oxford University Press, London, hereby expressly incorporated by reference.

In another embodiment of the invention, nucleic acid sequences encoding MDDT may be used
to generate hybridization probes useful in mapping the naturally occurring genomic sequence. Either
coding or noncoding sequences may be used, and in some instances, noncoding sequences may be
preferable over coding sequences. For example, conservation of a coding sequence among members
of a multi-gene family may potentially cause undesired cross hybridization during chromosomal
mapping. The sequences may be mapped to a particular chromosome, to a specific region of a
chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes (HACs),
yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1
constructions, or single chromosome cDNA libraries. (See, e.g., Harrington, J.J. et al. (1997) Nat.
Genet. 15:345-355; Price, C.M. (1993) Blood Rev. 7:127-134; and Trask, B.J. (1991) Trends Genet.
7:149-154.) Once mapped, the nucleic acid sequences of the invention may be used to develop
genetic linkage maps, for example, which correlate the inheritance of a disease state with the
inheritance of a particular chromosome region or restriction fragment length polymorphism (RFLP).
(See, for example, Lander, E.S. and D. Botstein (1986) Proc. Natl. Acad. Sci. USA 83:7353-7357.)

Fluorescent in situ hybridization (FISH) may be correlated with other physical and genetic
map data. (See, e.g., Heinz-Ulrich, et al. (1995) in Meyers, supra, pp. 965-968.) Examples of genetic
map data can be found in various scientific journals or at the Online Mendelian Inheritance in Man
(OMIM) World Wide Web site. Correlation between the location of the gene encoding MDDT on a
physical map and a specific disorder, or a predisposition to a specific disorder, may help define the
region of DNA associated with that disorder and thus may further positional cloning efforts.

In situ hybridization of chromosomal preparations and physical mapping techniques, such as
linkage analysis using established chromosomal markers, may be used for extending genetic maps.
Often the placement of a gene on the chromosome of another mammalian species, such as mouse,
may reveal associated markers even if the exact chromosomal locus is not known. This information is
valuable to investigators searching for disease genes using positional cloning or other gene discovery
techniques. Once the gene or genes responsible for a disease or syndrome have been crudely
localized by genetic linkage to a particular genomic region, e.g., ataxia-telangiectasia to 11q22-23, any
sequences mapping to that area may represent associated or regulatory genes for further investigation.
(See, e.g., Gatti, R.A. et al. (1988) Nature 336:577-580.) The nucleotide sequence of the instant
invention may also be used to detect differences in the chromosomal location due to translocation,
inversion, etc., among normal, carrier, or affected individuals.

In another embodiment of the invention, MDDT, its catalytic or immunogenic fragments, or
oligopeptides thereof can be used for screening libraries of compounds in any of a variety of drug
screening techniques. The fragment employed in such screening may be free in solution, affixed to a
solid support, borne on a cell surface, or located intracellularly. The formation of binding complexes
between MDDT and the agent being tested may be measured.

Another technique for drug screening provides for high throughput screening of compounds
having suitable binding affinity to the protein of interest. (See, e.g., Geysen, et al. (1984) PCT
application WO84/03564.) In this method, large numbers of different small test compounds are
synthesized on a solid substrate. The test compounds are reacted with MDDT, or fragments thereof,
and washed. Bound MDDT is then detected by methods well known in the art. Purified MDDT can
also be coated directly onto plates for use in the aforementioned drug screening techniques.
Alternatively, non-neutralizing antibodies can be used to capture the peptide and immobilize it on a
solid support.

In another embodiment, one may use competitive drug screening assays in which neutralizing
antibodies capable of binding MDDT specifically compete with a test compound for binding MDDT.
In this manner, antibodies can be used to detect the presence of any peptide which shares one or more
antigenic determinants with MDDT.

In additional embodiments, the nucleotide sequences which encode MDDT may be used in
any molecular biology techniques that have yet to be developed, provided the new techniques rely on
properties of nucleotide sequences that are currently known, including, but not limited to, such
properties as the triplet genetic code and specific base pair interactions.

Without further elaboration, it is believed that one skilled in the art can, using the preceding
description, utilize the present invention to its fullest extent. The following embodiments are, therefore,
to be construed as merely illustrative, and not limitative of the remainder of the disclosure in any way
whatsoever.

The disclosures of all patents, applications and publications, mentioned above and below,
including U.S. Ser. No. 60/280,387, U.S. Ser. No. 60/282,335, U.S. Ser. No. 60/283,663, U.S. Ser.
No. 60/285,484, U.S. Ser. No. 60/350,702, and U.S. Ser. No. 60/351,749, are expressly incorporated
by reference herein.

EXAMPLES
I. Construction of cDNA Libraries

Incyte cDNAs were derived from cDNA libraries described in the LIFESEQ GOLD
database (Incyte Genomics, Palo Alto CA). Some tissues were homogenized and lysed in guanidinium
isothiocyanate, while others were homogenized and lysed in phenol or in a suitable mixture of
denaturants, such as TRIZOL (Life Technologies), a monophasic solution of phenol and guanidine
isothiocyanate. The resulting lysates were centrifuged over CsCl cushions or extracted with
chloroform. RNA was precipitated from the lysates with either isopropanol or sodium acetate and
ethanol, or by other routine methods.

Phenol extraction and precipitation of RNA were repeated as necessary to increase RNA
purity. In some cases, RNA was treated with DNase. For most libraries, poly(A)+ RNA was
isolated using oligo d(T)-coupled paramagnetic particles (Promega), OLIGOTEX latex particles
(QIAGEN, Chatsworth CA), or an OLIGOTEX mRNA purification kit (QIAGEN). Alternatively,
RNA was isolated directly from tissue lysates using other RNA isolation kits, e.g., the
POLY(A)PURE mRNA purification kit (Ambion, Austin TX).

In some cases, Stratagene was provided with RNA and constructed the corresponding cDNA
libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed with the
UNIZAP vector system (Stratagene) or SUPERSCRIPT plasmid system (Life Technologies), using
the recommended procedures or similar methods known in the art. (See, e.g., Ausubel, 1997, supra,
units 5.1-6.6.) Reverse transcription was initiated using oligo d(T) or random primers. Synthetic
oligonucleotide adapters were ligated to double stranded cDNA, and the cDNA was digested with the
appropriate restriction enzyme or enzymes. For most libraries, the cDNA was size-selected
(300-1000 bp) using SEPHACRYL S1000, SEPHAROSE CL2B, or SEPHAROSE CL4B column
chromatography (Amersham Pharmacia Biotech) or preparative agarose gel electrophoresis. cDNAs
were ligated into compatible restriction enzyme sites of the polylinker of a suitable plasmid, e.g.,
PBLUESCRIPT plasmid (Stratagene), PSPORT1 plasmid (Life Technologies), PCDNA2.1 plasmid
(Invitrogen, Carlsbad CA), PBK-CMV plasmid (Stratagene), PCR2-TOPOTA plasmid (Invitrogen),
PCMV-ICIS plasmid (Stratagene), pIGEN (Incyte Genomics, Palo Alto CA), pRARE (Incyte
Genomics), or pINCY (Incyte Genomics), or derivatives thereof. Recombinant plasmids were
transformed into competent E. coli cells including XL1-Blue, XL1-BlueMRF, or SOLR from
Stratagene or DH5α, DH10B, or ElectroMAX DH10B from Life Technologies.
II. Isolation of cDNA Clones

Plasmids obtained as described in Example I were recovered from host cells by in vivo
excision using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids were purified using
at least one of the following: a Magic or WIZARD Minipreps DNA purification system (Promega); an
AGTC Miniprep purification kit (Edge Biosystems, Gaithersburg MD); and QIAWELL 8 Plasmid,
QIAWELL 8 Plus Plasmid, QIAWELL 8 Ultra Plasmid purification systems or the R.E.A.L. PREP
96 plasmid purification kit from QIAGEN. Following precipitation, plasmids were resuspended in 0.1
ml of distilled water and stored, with or without lyophilization, at 4°C.

Alternatively, plasmid DNA was amplified from host cell lysates using direct link PCR in a
high-throughput format (Rao, V.B. (1994) Anal. Biochem. 216:1-14). Host cell lysis and thermal
cycling steps were carried out in a single reaction mixture. Samples were processed and stored in
384-well plates, and the concentration of amplified plasmid DNA was quantified fluorometrically using
PICOGREEN dye (Molecular Probes, Eugene OR) and a FLUOROSKAN II fluorescence scanner
(Labsystems Oy, Helsinki, Finland).
III. Sequencing and Analysis

Incyte cDNA recovered in plasmids as described in Example II were sequenced as follows.
Sequencing reactions were processed using standard methods or high-throughput instrumentation such
as the ABI CATALYST 800 (Applied Biosystems) thermal cycler or the PTC-200 thermal cycler
(MJ Research) in conjunction with the HYDRA microdispenser (Robbins Scientific) or the
MICROLAB 2200 (Hamilton) liquid transfer system. cDNA sequencing reactions were prepared
using reagents provided by Amersham Pharmacia Biotech or supplied in ABI sequencing kits such as
the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems).
Electrophoretic separation of cDNA sequencing reactions and detection of labeled polynucleotides
were carried out using the MEGABACE 1000 DNA sequencing system (Molecular Dynamics); the
ABI PRISM 373 or 377 sequencing system (Applied Biosystems) in conjunction with standard ABI
protocols and base calling software; or other sequence analysis systems known in the art. Reading
frames within the cDNA sequences were identified using standard methods (reviewed in Ausubel,
1997, supra, unit 7.7). Some of the cDNA sequences were selected for extension using the
techniques disclosed in Example VIII.

The polynucleotide sequences derived from Incyte cDNAs were validated by removing
vector, linker, and poly(A) sequences and by masking ambiguous bases, using algorithms and
programs based on BLAST, dynamic programming, and dinucleotide nearest neighbor analysis. The
Incyte cDNA sequences or translations thereof were then queried against a selection of public
databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases, and
BLOCKS, PRINTS, DOMO, PRODOM; PROTEOME databases with sequences from Homo
sapiens, Rattus norvegicus, Mus musculus, Caenorhabditis elegans, Saccharomyces cerevisiae,
Schizosaccharomyces pombe, and Candida albicans (Incyte Genomics, Palo Alto CA); hidden Markov
model (HMM)-based protein family databases such as PFAM, INCY, and TIGRFAM (Haft, D.H. et
al. (2001) Nucleic Acids Res. 29:41-43); and HMM-based protein domain databases such as SMART
(Schultz et al. (1998) Proc. Natl. Acad. Sci. USA 95:5857-5864; Letunic, I. et al. (2002) Nucleic
Acids Res. 30:242-244). (HMM is a probabilistic approach which analyzes consensus primary
structures of gene families. See, for example, Eddy, S.R. (1996) Curr. Opin. Struct. Biol. 6:361-365.)
The queries were performed using programs based on BLAST, FASTA, BLIMPS, and HMMER.
The Incyte cDNA sequences were assembled to produce full length polynucleotide sequences.
Alternatively, GenBank cDNAs, GenBank ESTs, stitched sequences, stretched sequences, or
Genscan-predicted coding sequences (see Examples IV and V) were used to extend Incyte cDNA
assemblages to full length. Assembly was performed using programs based on Phred, Phrap, and
Consed, and cDNA assemblages were screened for open reading frames using programs based on
GeneMark, BLAST, and FASTA. The full length polynucleotide sequences were translated to derive
the corresponding full length polypeptide sequences. Alternatively, a polypeptide of the invention may
begin at any of the methionine residues of the full length translated polypeptide. Full length polypeptide
sequences were subsequently analyzed by querying against databases such as the GenBank protein
databases (genpept), SwissProt, the PROTEOME databases, BLOCKS, PRINTS, DOMO,
PRODOM, Prosite, hidden Markov model (HMM)-based protein family databases such as PFAM,
INCY, and TIGRFAM; and HMM-based protein domain databases such as SMART. Full length
polynucleotide sequences are also analyzed using MACDNASIS PRO software (Hitachi Software
Engineering, South San Francisco CA) and LASERGENE software (DNASTAR). Polynucleotide
and polypeptide sequence alignments are generated using default parameters specified by the
CLUSTAL algorithm as incorporated into the MEGALIGN multisequence alignment program
(DNASTAR), which also calculates the percent identity between aligned sequences.

Table 7 summarizes the tools, programs, and algorithms used for the analysis and assembly of
Incyte cDNA and full length sequences and provides applicable descriptions, references, and
threshold parameters. The first column of Table 7 shows the tools, programs, and algorithms used, the
second column provides brief descriptions thereof, the third column presents appropriate references,
all of which are incorporated by reference herein in their entirety, and the fourth column presents,
where applicable, the scores, probability values, and other parameters used to evaluate the strength of
a match between two sequences (the higher the score or the lower the probability value, the greater
the identity between two sequences).

The programs described above for the assembly and analysis of full length polynucleotide and
polypeptide sequences were also used to identify polynucleotide sequence fragments from SEQ ID
NO:24-46. Fragments from about 20 to about 4000 nucleotides which are useful in hybridization and
amplification technologies are described in Table 4, column 2.
IV. Identification and Editing of Coding Sequences from Genomic DNA

Putative molecules for disease detection and treatment were initially identified by running the
Genscan gene identification program against public genomic sequence databases (e.g., gbpri and
gbhtg). Genscan is a general-purpose gene identification program which analyzes genomic DNA
sequences from a variety of organisms. (See Burge, C. and S. Karlin (1997) J. Mol. Biol. 268:78-94,
and Burge, C. and S. Karlin (1998) Curr. Opin. Struct. Biol. 8:346-354.) The program concatenates
predicted exons to form an assembled cDNA sequence extending from a methionine to a stop codon.
The output of Genscan is a FASTA database of polynucleotide and polypeptide sequences. The
maximum range of sequence for Genscan to analyze at once was set to 30 kb. To determine which of
these Genscan predicted cDNA sequences encode molecules for disease detection and treatment, the
encoded polypeptides were analyzed by querying against PFAM models for molecules for disease
detection and treatment. Potential molecules for disease detection and treatment were also identified
by homology to Incyte cDNA sequences that had been annotated as molecules for disease detection
and treatment. These selected Genscan-predicted sequences were then compared by BLAST
analysis to the genpept and gbpri public databases. Where necessary, the Genscan-predicted
sequences were then edited by comparison to the top BLAST hit from genpept to correct errors in the
sequence predicted by Genscan, such as extra or omitted exons. BLAST analysis was also used to
find any Incyte cDNA or public cDNA coverage of the Genscan-predicted sequences, thus providing
evidence for transcription. When Incyte cDNA coverage was available, this information was used to
correct or confirm the Genscan predicted sequence. Full length polynucleotide sequences were
obtained by assembling Genscan-predicted coding sequences with Incyte cDNA sequences and/or
public cDNA sequences using the assembly process described in Example III. Alternatively, full
length polynucleotide sequences were derived entirely from edited or unedited Genscan-predicted
coding sequences.

V. Assembly of Genomic Sequence Data with cDNA Sequence Data
"Stitched" Sequences

Partial cDNA sequences were extended with exons predicted by the Genscan gene
identification program described in Example IV. Partial cDNAs assembled as described in Example
III were mapped to genomic DNA and parsed into clusters containing related cDNAs and Genscan
exon predictions from one or more genomic sequences. Each cluster was analyzed using an algorithm
based on graph theory and dynamic programming to integrate cDNA and genomic information,
generating possible splice variants that were subsequently confirmed, edited, or extended to create a
full length sequence. Sequence intervals in which the entire length of the interval was present on
more than one sequence in the cluster were identified, and intervals thus identified were considered to
be equivalent by transitivity. For example, if an interval was present on a cDNA and two genomic
sequences, then all three intervals were considered to be equivalent. This process allows unrelated
but consecutive genomic sequences to be brought together, bridged by cDNA sequence. Intervals
thus identified were then "stitched" together by the stitching algorithm in the order that they appear
along their parent sequences to generate the longest possible sequence, as well as sequence variants.
Linkages between intervals which proceed along one type of parent sequence (cDNA to cDNA or
genomic sequence to genomic sequence) were given preference over linkages which change parent
type (cDNA to genomic sequence). The resultant stitched sequences were translated and compared
by BLAST analysis to the genpept and gbpri public databases. Incorrect exons predicted by Genscan
were corrected by comparison to the top BLAST hit from genpept. Sequences were further extended
with additional cDNA sequences, or by inspection of genomic DNA, when necessary.
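
The grouping of intervals "by transitivity" can be illustrated with a small sketch. This is not the stitching algorithm itself, whose details are not given here; it only shows one common way (a union-find structure) to turn pairwise interval matches into equivalence classes. The function name and the interval labels are hypothetical.

```python
def equivalence_classes(matches):
    """Group interval labels that are linked, directly or indirectly,
    by pairwise matches (an illustrative union-find, not the patent's code)."""
    parent = {}

    def find(label):
        # Register unseen labels as their own representative.
        parent.setdefault(label, label)
        while parent[label] != label:
            parent[label] = parent[parent[label]]  # path halving
            label = parent[label]
        return label

    for a, b in matches:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two classes

    groups = {}
    for label in list(parent):
        groups.setdefault(find(label), set()).add(label)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical labels: one cDNA interval matching two genomic intervals.
pairs = [("cDNA_1", "genomic_A"), ("cDNA_1", "genomic_B"), ("cDNA_2", "genomic_C")]
print(equivalence_classes(pairs))
```

In this toy input the two genomic intervals are never compared with each other, yet they land in the same class because both match the same cDNA interval, which mirrors the three-interval example in the text.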
"Stretched" Sequences

Partial DNA sequences were extended to full length with an algorithm based on BLAST
analysis. First, partial cDNAs assembled as described in Example III were queried against public
databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases
using the BLAST program. The nearest GenBank protein homolog was then compared by BLAST
analysis to either Incyte cDNA sequences or GenScan exon predicted sequences described in
Example IV. A chimeric protein was generated by using the resultant high-scoring segment pairs
(HSPs) to map the translated sequences onto the GenBank protein homolog. Insertions or deletions
may occur in the chimeric protein with respect to the original GenBank protein homolog. The
GenBank protein homolog, the chimeric protein, or both were used as probes to search for homologous
genomic sequences from the public human genome databases. Partial DNA sequences were
therefore "stretched" or extended by the addition of homologous genomic sequences. The resultant
stretched sequences were examined to determine whether they contained a complete gene.
VI. Chromosomal Mapping of MDDT Encoding Polynucleotides

The sequences which were used to assemble SEQ ID NO:24-46 were compared with
sequences from the Incyte LIFESEQ database and public domain databases using BLAST and other
implementations of the Smith-Waterman algorithm. Sequences from these databases that matched
SEQ ID NO:24-46 were assembled into clusters of contiguous and overlapping sequences using
assembly algorithms such as Phrap (Table 7). Radiation hybrid and genetic mapping data available
from public resources such as the Stanford Human Genome Center (SHGC), Whitehead Institute for
Genome Research (WIGR), and Généthon were used to determine if any of the clustered sequences
had been previously mapped. Inclusion of a mapped sequence in a cluster resulted in the assignment
of all sequences of that cluster, including its particular SEQ ID NO:, to that map location.

Map locations are represented by ranges, or intervals, of human chromosomes. The map
position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's
p-arm. (The centiMorgan (cM) is a unit of measurement based on recombination frequencies between
chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in
humans, although this can vary widely due to hot and cold spots of recombination.) The cM distances
are based on genetic markers mapped by Généthon which provide boundaries for radiation hybrid
markers whose sequences were included in each of the clusters. Human genome maps and other
resources available to the public, such as the NCBI "GeneMap'99" World Wide Web site
(http://www.ncbi.nlm.nih.gov/genemap/), can be employed to determine if previously identified disease
genes map within or in proximity to the intervals indicated above.

VII. Analysis of Polynucleotide Expression

Northern analysis is a laboratory technique used to detect the presence of a transcript of a
gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs
from a particular cell type or tissue have been bound. (See, e.g., Sambrook, supra, ch. 7; Ausubel
(1995) supra, ch. 4 and 16.)

Analogous computer techniques applying BLAST were used to search for identical or related
molecules in cDNA databases such as GenBank or LIFESEQ (Incyte Genomics). This analysis is
much faster than multiple membrane-based hybridizations. In addition, the sensitivity of the computer
search can be modified to determine whether any particular match is categorized as exact or similar.
The basis of the search is the product score, which is defined as:

(BLAST Score x Percent Identity) / (5 x minimum {length(Seq. 1), length(Seq. 2)})

The product score takes into account both the degree of similarity between two sequences and the
length of the sequence match. The product score is a normalized value between 0 and 100, and is
calculated as follows: the BLAST score is multiplied by the percent nucleotide identity and the
product is divided by (5 times the length of the shorter of the two sequences). The BLAST score is
calculated by assigning a score of +5 for every base that matches in a high-scoring segment pair
(HSP), and -4 for every mismatch. Two sequences may share more than one HSP (separated by
gaps). If there is more than one HSP, then the pair with the highest BLAST score is used to calculate
the product score. The product score represents a balance between fractional overlap and quality in a
BLAST alignment. For example, a product score of 100 is produced only for 100% identity over the
entire length of the shorter of the two sequences being compared. A product score of 70 is produced
either by 100% identity and 70% overlap at one end, or by 88% identity and 100% overlap at the
other. A product score of 50 is produced either by 100% identity and 50% overlap at one end, or 79%
identity and 100% overlap.
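
The product score arithmetic can be checked with a short sketch. It assumes an ungapped HSP summarized only by its match and mismatch counts, and the function names are illustrative rather than part of any BLAST distribution.

```python
def blast_score(matches, mismatches):
    """Simplified HSP score as described in the text: +5 per match, -4 per mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(score, percent_identity, len_seq1, len_seq2):
    """(BLAST score x percent identity) / (5 x length of the shorter sequence)."""
    return score * percent_identity / (5 * min(len_seq1, len_seq2))


# Shorter sequence of 100 nt compared with a longer sequence of 250 nt.
full = product_score(blast_score(100, 0), 100, 100, 250)    # 100% identity, full overlap
partial = product_score(blast_score(70, 0), 100, 100, 250)  # 100% identity, 70% overlap
diverged = product_score(blast_score(88, 12), 88, 100, 250) # 88% identity, full overlap
print(full, partial, round(diverged, 1))
```

With these inputs the first two cases give 100 and 70, and the 88% identity case gives about 69, consistent with the rounded figure of 70 quoted in the text.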

Alternatively, polynucleotide sequences encoding MDDT are analyzed with respect to the
tissue sources from which they were derived. For example, some full length sequences are
assembled, at least in part, with overlapping Incyte cDNA sequences (see Example III). Each cDNA
sequence is derived from a cDNA library constructed from a human tissue. Each human tissue is
classified into one of the following organ/tissue categories: cardiovascular system; connective tissue;
digestive system; embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia,
male; germ cells; hemic and immune system; liver; musculoskeletal system; nervous system;
pancreas; respiratory system; sense organs; skin; stomatognathic system; unclassified/mixed; or
urinary tract. The number of libraries in each category is counted and divided by the total number of
libraries across all categories. Similarly, each human tissue is classified into one of the following
disease/condition categories: cancer, cell line, developmental, inflammation, neurological, trauma,
cardiovascular, pooled, and other, and the number of libraries in each category is counted and divided
by the total number of libraries across all categories. The resulting percentages reflect the tissue- and
disease-specific expression of cDNA encoding MDDT. cDNA sequences and cDNA library/tissue
information are found in the LIFESEQ GOLD database (Incyte Genomics, Palo Alto CA).
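
As a minimal sketch of the counting step described above, the following assumes that each library contributing a cDNA has already been assigned a single category label; the input list and function name are hypothetical.

```python
from collections import Counter


def category_percentages(library_categories):
    """Percentage of libraries falling in each category.

    `library_categories` holds one category label per cDNA library
    (an assumed input format, used only for illustration).
    """
    counts = Counter(library_categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical organ/tissue labels for four libraries.
libraries = ["liver", "liver", "nervous system", "skin"]
print(category_percentages(libraries))
```

The same function applies unchanged to the disease/condition labels, since both tallies divide a per-category count by the total number of libraries.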
VIII. Extension of MDDT Encoding Polynucleotides

Full length polynucleotide sequences were also produced by extension of an appropriate
fragment of the full length molecule using oligonucleotide primers designed from this fragment. One
primer was synthesized to initiate 5' extension of the known fragment, and the other primer was
synthesized to initiate 3' extension of the known fragment. The initial primers were designed using
OLIGO 4.06 software (National Biosciences), or another appropriate program, to be about 22 to 30
nucleotides in length, to have a GC content of about 50% or more, and to anneal to the target
sequence at temperatures of about 68°C to about 72°C. Any stretch of nucleotides which would
result in hairpin structures and primer-primer dimerizations was avoided.
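
Two of the stated design criteria, primer length and GC content, are simple enough to sketch in code. The example below is only an illustration with hypothetical function names; it does not reproduce the OLIGO 4.06 program, and it does not attempt the annealing temperature, hairpin, or primer-dimer checks.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in a primer sequence."""
    seq = primer.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def passes_basic_primer_filters(primer, min_len=22, max_len=30, min_gc=0.5):
    """Check only the length window and minimum GC content given in the text."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc


# Hypothetical candidate primers.
for candidate in ["GCATGCATGCATGCATGCATGCAT", "ATATATATATATATATATATATAT", "GCGC"]:
    print(candidate, passes_basic_primer_filters(candidate))
```

The thresholds are exposed as keyword arguments because the text gives them as approximate values ("about 22 to 30", "about 50% or more").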

Selected human cDNA libraries were used to extend the sequence. If more than one
extension was necessary or desired, additional or nested sets of primers were designed.

High fidelity amplification was obtained by PCR using methods well known in the art. PCR
was performed in 96-well plates using the PTC-200 thermal cycler (MJ Research, Inc.). The reaction
mix contained DNA template, 200 nmol of each primer, reaction buffer containing Mg2+, (NH4)2SO4,
and 2-mercaptoethanol, Taq DNA polymerase (Amersham Pharmacia Biotech), ELONGASE
enzyme (Life Technologies), and Pfu DNA polymerase (Stratagene), with the following parameters
for primer pair PCI A and PCI B: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min;
Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage
at 4°C. In the alternative, the parameters for primer pair T7 and SK+ were as follows: Step 1: 94°C,
3 min; Step 2: 94°C, 15 sec; Step 3: 57°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4
repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C.

The concentration of DNA in each well was determined by dispensing 100 µl PICOGREEN
quantitation reagent (0.25% (v/v) PICOGREEN; Molecular Probes, Eugene OR) dissolved in 1X TE
and 0.5 µl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Costar,
Acton MA), allowing the DNA to bind to the reagent. The plate was scanned in a Fluoroskan II
(Labsystems Oy, Helsinki, Finland) to measure the fluorescence of the sample and to quantify the
concentration of DNA. A 5 µl to 10 µl aliquot of the reaction mixture was analyzed by
electrophoresis on a 1% agarose gel to determine which reactions were successful in extending the
sequence.

The extended nucleotides were desalted and concentrated, transferred to 384-well plates,
digested with CviJI cholera virus endonuclease (Molecular Biology Research, Madison WI), and
sonicated or sheared prior to religation into pUC 18 vector (Amersham Pharmacia Biotech). For
shotgun sequencing, the digested nucleotides were separated on low concentration (0.6 to 0.8%)
agarose gels, fragments were excised, and agar digested with Agar ACE (Promega). Extended
clones were religated using T4 ligase (New England Biolabs, Beverly MA) into pUC 18 vector
(Amersham Pharmacia Biotech), treated with Pfu DNA polymerase (Stratagene) to fill-in restriction
site overhangs, and transfected into competent E. coli cells. Transformed cells were selected on
antibiotic-containing media, and individual colonies were picked and cultured overnight at 37°C in
384-well plates in LB/2x carb liquid media.

The cells were lysed, and DNA was amplified by PCR using Taq DNA polymerase
(Amersham Pharmacia Biotech) and Pfu DNA polymerase (Stratagene) with the following
parameters: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 72°C, 2 min; Step
5: steps 2, 3, and 4 repeated 29 times; Step 6: 72°C, 5 min; Step 7: storage at 4°C. DNA was
quantified by PICOGREEN reagent (Molecular Probes) as described above. Samples with low DNA
recoveries were reamplified using the same conditions as described above. Samples were diluted with
20% dimethylsulfoxide (1:2, v/v), and sequenced using DYENAMIC energy transfer sequencing
primers and the DYENAMIC DIRECT kit (Amersham Pharmacia Biotech) or the ABI PRISM
BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems).

In like manner, full length polynucleotide sequences are verified using the above procedure or
are used to obtain 5' regulatory sequences using the above procedure along with oligonucleotides
designed for such extension, and an appropriate genomic library.

IX. Identification of Single Nucleotide Polymorphisms in MDDT Encoding
Polynucleotides

Common DNA sequence variants known as single nucleotide polymorphisms (SNPs) were
identified in SEQ ID NO:24-46 using the LIFESEQ database (Incyte Genomics). Sequences from the
same gene were clustered together and assembled as described in Example III, allowing the
identification of all sequence variants in the gene. An algorithm consisting of a series of filters was
used to distinguish SNPs from other sequence variants. Preliminary filters removed the majority of
basecall errors by requiring a minimum Phred quality score of 15, and removed sequence alignment
errors and errors resulting from improper trimming of vector sequences, chimeras, and splice variants.
An automated procedure of advanced chromosome analysis analysed the original chromatogram files
in the vicinity of the putative SNP. Clone error filters used statistically generated algorithms to identify
errors introduced during laboratory processing, such as those caused by reverse transcriptase,
polymerase, or somatic mutation. Clustering error filters used statistically generated algorithms to
identify errors resulting from clustering of close homologs or pseudogenes, or due to contamination by
non-human sequences. A final set of filters removed duplicates and SNPs found in immunoglobulins
or T-cell receptors.
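
For orientation, the quality threshold used by the preliminary filter can be related to an error probability through the standard Phred definition, Q = -10 log10(P). The sketch below is illustrative only: replacing low-quality calls with "N" is an assumption made for the example, not a description of the filters actually used, and the function names are hypothetical.

```python
def phred_error_probability(quality):
    """Estimated probability that a base call is wrong, from Q = -10 * log10(P)."""
    return 10 ** (-quality / 10)


def mask_low_quality(bases, qualities, min_quality=15):
    """Replace base calls below the quality threshold with 'N' (assumed convention)."""
    return "".join(
        base if quality >= min_quality else "N"
        for base, quality in zip(bases, qualities)
    )


print(round(phred_error_probability(15), 4))       # error rate implied by Q15
print(mask_low_quality("ACGT", [30, 14, 15, 2]))   # calls at Q14 and Q2 are masked
```

A threshold of 15 therefore corresponds to accepting base calls with an estimated error probability of roughly 3% or lower.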

Certain SNPs were selected for further characterization by mass spectrometry using the high
throughput MASSARRAY system (Sequenom, Inc.) to analyze allele frequencies at the SNP sites in
four different human populations. The Caucasian population comprised 92 individuals (46 male, 46
female), including 83 from Utah, four French, three Venezuelan, and two Amish individuals. The
African population comprised 194 individuals (97 male, 97 female), all African Americans. The
Hispanic population comprised 324 individuals (162 male, 162 female), all Mexican Hispanic. The
Asian population comprised 126 individuals (64 male, 62 female) with a reported parental breakdown
of 43% Chinese, 31% Japanese, 13% Korean, 5% Vietnamese, and 8% other Asian. Allele
frequencies were first analyzed in the Caucasian population; in some cases those SNPs which showed
no allelic variance in this population were not further tested in the other three populations.

X. Labeling and Use of Individual Hybridization Probes 

Hybridization probes derived from SEQ ID NO:24-46 are employed to screen cDNAs,
genomic DNAs, or mRNAs. Although the labeling of oligonucleotides, consisting of about 20 base
pairs, is specifically described, essentially the same procedure is used with larger nucleotide
fragments. Oligonucleotides are designed using state-of-the-art software such as OLIGO 4.06
software (National Biosciences) and labeled by combining 50 pmol of each oligomer, 250 µCi of
[γ-32P] adenosine triphosphate (Amersham Pharmacia Biotech), and T4 polynucleotide kinase
(DuPont NEN, Boston MA). The labeled oligonucleotides are substantially purified using a
SEPHADEX G-25 superfine size exclusion dextran bead column (Amersham Pharmacia Biotech).
An aliquot containing 10^7 counts per minute of the labeled probe is used in a typical membrane-based
hybridization analysis of human genomic DNA digested with one of the following endonucleases: Ase
I, Bgl II, Eco RI, Pst I, Xba I, or Pvu II (DuPont NEN).

The DNA from each digest is fractionated on a 0.7% agarose gel and transferred to nylon
membranes (Nytran Plus, Schleicher & Schuell, Durham NH). Hybridization is carried out for 16
hours at 40°C. To remove nonspecific signals, blots are sequentially washed at room temperature
under conditions of up to, for example, 0.1 x saline sodium citrate and 0.5% sodium dodecyl sulfate.
Hybridization patterns are visualized using autoradiography or an alternative imaging means and
compared.

XI. Microarrays 

The linkage or synthesis of array elements upon a microarray can be achieved utilizing
photolithography, piezoelectric printing (ink-jet printing; see, e.g., Baldeschweiler, supra), mechanical
microspotting technologies, and derivatives thereof. The substrate in each of the aforementioned
technologies should be uniform and solid with a non-porous surface (Schena (1999), supra).
Suggested substrates include silicon, silica, glass slides, glass chips, and silicon wafers. Alternatively, a
procedure analogous to a dot or slot blot may also be used to arrange and link elements to the surface
of a substrate using thermal, UV, chemical, or mechanical bonding procedures. A typical array may
be produced using available methods and machines well known to those of ordinary skill in the art and
may contain any appropriate number of elements. (See, e.g., Schena, M. et al. (1995) Science
270:467-470; Shalon, D. et al. (1996) Genome Res. 6:639-645; Marshall, A. and J. Hodgson (1998)
Nat. Biotechnol. 16:27-31.)

Full length cDNAs, Expressed Sequence Tags (ESTs), or fragments or oligomers thereof may
comprise the elements of the microarray. Fragments or oligomers suitable for hybridization can be
selected using software well known in the art such as LASERGENE software (DNASTAR). The
array elements are hybridized with polynucleotides in a biological sample. The polynucleotides in the
biological sample are conjugated to a fluorescent label or other molecular tag for ease of detection.
After hybridization, nonhybridized nucleotides from the biological sample are removed, and a
fluorescence scanner is used to detect hybridization at each array element. Alternatively, laser
desorption and mass spectrometry may be used for detection of hybridization. The degree of
complementarity and the relative abundance of each polynucleotide which hybridizes to an element on
the microarray may be assessed. In one embodiment, microarray preparation and usage is described
in detail below.

Tissue or Cell Sample Preparation

Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and
poly(A)+ RNA is purified using the oligo-(dT) cellulose method. Each poly(A)+ RNA sample is
reverse transcribed using MMLV reverse-transcriptase, 0.05 pg/μl oligo-(dT) primer (21mer), 1X first
strand buffer, 0.03 units/μl RNase inhibitor, 500 μM dATP, 500 μM dGTP, 500 μM dTTP, 40 μM
dCTP, 40 μM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Pharmacia Biotech). The reverse
transcription reaction is performed in a 25 μl volume containing 200 ng poly(A)+ RNA with
GEMBRIGHT kits (Incyte). Specific control poly(A)+ RNAs are synthesized by in vitro transcription
from non-coding yeast genomic DNA. After incubation at 37°C for 2 hr, each reaction sample (one
with Cy3 and another with Cy5 labeling) is treated with 2.5 μl of 0.5 M sodium hydroxide and
incubated for 20 minutes at 85°C to stop the reaction and degrade the RNA. Samples are purified
using two successive CHROMA SPIN 30 gel filtration spin columns (CLONTECH Laboratories, Inc.
(CLONTECH), Palo Alto CA) and after combining, both reaction samples are ethanol precipitated
using 1 μl of glycogen (1 mg/ml), 60 μl sodium acetate, and 300 μl of 100% ethanol. The sample is
then dried to completion using a SpeedVAC (Savant Instruments Inc., Holbrook NY) and resuspended
in 14 μl 5X SSC/0.2% SDS.
Microarray Preparation 

Sequences of the present invention are used to generate array elements. Each array element
is amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR amplification uses
primers complementary to the vector sequences flanking the cDNA inserts. Array elements are
amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5 μg.
Amplified array elements are then purified using SEPHACRYL-400 (Amersham Pharmacia Biotech).
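
As a rough check on the amplification figures quoted above, the fold increase and the implied average per-cycle efficiency can be worked out as below. This is only a sketch under a simple assumed model, in which product grows as (1 + E) to the power of the cycle count with a constant efficiency E; the function names are illustrative and the model is not taken from the text.

```python
def required_fold(initial_ng, final_ug):
    """Fold amplification needed to go from initial_ng to final_ug of product."""
    return (final_ug * 1000.0) / initial_ng


def mean_cycle_efficiency(fold, cycles):
    """Average per-cycle efficiency E, assuming yield grows as (1 + E) ** cycles."""
    return fold ** (1.0 / cycles) - 1.0


# Bounds of the stated range: 1-2 ng of template amplified to 5 micrograms.
fold_low = required_fold(initial_ng=2.0, final_ug=5.0)
fold_high = required_fold(initial_ng=1.0, final_ug=5.0)

for fold in (fold_low, fold_high):
    efficiency = mean_cycle_efficiency(fold, cycles=30)
    print(f"{fold:.0f}-fold over 30 cycles -> mean efficiency {efficiency:.2f}")
```

Under this assumption the required fold increase is a few thousand, far below the 2**30 that perfect doubling in every cycle would give, which is consistent with the reaction reaching a plateau well before the last cycle.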

Purified array elements are immobilized on polymer-coated glass slides. Glass microscope
slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water
washes between and after treatments. Glass slides are etched in 4% hydrofluoric acid (VWR
Scientific Products Corporation (VWR), West Chester PA), washed extensively in distilled water, and
coated with 0.05% aminopropyl silane (Sigma) in 95% ethanol. Coated slides are cured in a 110°C
oven.

Array elements are applied to the coated glass substrate using a procedure described in U.S.
Patent No. 5,807,522, incorporated herein by reference. 1 μl of the array element DNA, at an average
concentration of 100 ng/μl, is loaded into the open capillary printing element by a high-speed robotic
apparatus. The apparatus then deposits about 5 nl of array element sample per slide.

Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene).
Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water.
Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate
buffered saline (PBS) (Tropix, Inc., Bedford MA) for 30 minutes at 60°C followed by washes in 0.2%
SDS and distilled water as before.
Hybridization

Hybridization reactions contain 9 μl of sample mixture consisting of 0.2 μg each of Cy3 and
Cy5 labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer. The sample
mixture is heated to 65°C for 5 minutes and is aliquoted onto the microarray surface and covered with
a 1.8 cm² coverslip. The arrays are transferred to a waterproof chamber having a cavity just slightly
larger than a microscope slide. The chamber is kept at 100% humidity internally by the addition of 140
μl of 5X SSC in a corner of the chamber. The chamber containing the arrays is incubated for about
6.5 hours at 60°C. The arrays are washed for 10 min at 45°C in a first wash buffer (1X SSC, 0.1%
SDS), three times for 10 minutes each at 45°C in a second wash buffer (0.1X SSC), and dried.
Detection 

Reporter-labeled hybridization complexes are detected with a microscope equipped with an
Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of generating spectral lines
at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The excitation laser light is
focused on the array using a 20X microscope objective (Nikon, Inc., Melville NY). The slide
containing the array is placed on a computer-controlled X-Y stage on the microscope and raster-
scanned past the objective. The 1.8 cm x 1.8 cm array used in the present example is scanned with a
resolution of 20 micrometers.
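
The scan geometry above fixes the size of the resulting image. A small worked calculation follows; it assumes one pixel per 20-micrometer step in each direction, which the text implies but does not state, and the function name is illustrative.

```python
def pixels_per_side(side_cm, resolution_um):
    """Number of scan steps along one side of a square array."""
    side_um = round(side_cm * 10_000)  # 1 cm = 10,000 micrometers
    return side_um // resolution_um


side = pixels_per_side(side_cm=1.8, resolution_um=20)
total = side * side  # pixels per fluorophore scan

print(side, total)
```

With the stated dimensions this gives 900 steps per side, so each single-fluorophore scan yields an image of 810,000 pixels.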

In two separate scans, a mixed gas multiline laser excites the two fluorophores sequentially.
Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477,
Hamamatsu Photonics Systems, Bridgewater NJ) corresponding to the two fluorophores. Appropriate
filters positioned between the array and the photomultiplier tubes are used to filter the signals. The
emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for Cy5. Each array is
typically scanned twice, one scan per fluorophore using the appropriate filters at the laser source,
although the apparatus is capable of recording the spectra from both fluorophores simultaneously.

The sensitivity of the scans is typically calibrated using the signal intensity generated by a
cDNA control species added to the sample mixture at a known concentration. A specific location on
the array contains a complementary DNA sequence, allowing the intensity of the signal at that location
to be correlated with a weight ratio of hybridizing species of 1:100,000. When two samples from
different sources (e.g., representing test and control cells), each labeled with a different fluorophore,
are hybridized to a single array for the purpose of identifying genes that are differentially expressed,
the calibration is done by labeling samples of the calibrating cDNA with the two fluorophores and
adding identical amounts of each to the hybridization mixture.

The output of the photomultiplier tube is digitized using a 12-bit RTI-835H analog-to-digital
(A/D) conversion board (Analog Devices, Inc., Norwood MA) installed in an IBM-compatible PC
computer. The digitized data are displayed as an image where the signal intensity is mapped using a
linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high
signal). The data are also analyzed quantitatively. Where two different fluorophores are excited and
measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping emission
spectra) between the fluorophores using each fluorophore's emission spectrum.
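
The linear 20-color display mapping described above can be illustrated with a short sketch. It assumes that the 12-bit converter yields integer values from 0 to 4095 and that the 20 colors are equal-width bins indexed from 0 (blue end) to 19 (red end); the function name and the binning rule are illustrative assumptions, not details given in the text.

```python
def color_bin(adc_value, n_colors=20, adc_bits=12):
    """Map a digitized intensity to one of n_colors equal-width bins.

    Bin 0 corresponds to the blue (low signal) end of the pseudocolor
    scale and bin n_colors - 1 to the red (high signal) end.
    """
    full_scale = 2 ** adc_bits  # 4096 levels for a 12-bit converter
    if not 0 <= adc_value < full_scale:
        raise ValueError("value outside the converter range")
    return adc_value * n_colors // full_scale


for value in (0, 2048, 4095):
    print(value, color_bin(value))
```

Integer arithmetic keeps the bin boundaries exact, so the top converter value falls in the last bin and never overflows the color table.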

A grid is superimposed over the fluorescence signal image such that the signal from each spot
is centered in each element of the grid. The fluorescence signal within each element is then integrated
to obtain a numerical value corresponding to the average intensity of the signal. The software used
for signal analysis is the GEMTOOLS gene expression analysis program (Incyte).
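
The grid integration step can be sketched in a few lines. This is a hedged illustration only and is not the GEMTOOLS implementation: it assumes a regular grid of square, non-overlapping elements aligned with the image origin, and the image values are invented for the example.

```python
def spot_means(image, cell):
    """Average pixel intensity within each cell x cell grid element.

    image is a list of equal-length rows of pixel intensities; the
    result has one value per grid element, in row-major order.
    """
    rows, cols = len(image), len(image[0])
    means = []
    for r0 in range(0, rows - cell + 1, cell):
        row_means = []
        for c0 in range(0, cols - cell + 1, cell):
            block = [image[r][c]
                     for r in range(r0, r0 + cell)
                     for c in range(c0, c0 + cell)]
            row_means.append(sum(block) / len(block))
        means.append(row_means)
    return means


# Hypothetical 4 x 4 image holding four 2 x 2 grid elements.
image = [
    [0, 0, 10, 10],
    [0, 4, 10, 10],
    [8, 8, 2, 2],
    [8, 8, 2, 6],
]

print(spot_means(image, cell=2))
```

Each returned value is the sum of the pixels in one grid element divided by the pixel count, which corresponds to the average signal intensity described above.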
Expression 

TNF-α treatment of HAEC cultures

HAECs were maintained in EGM-2 medium (Clonetics, San Diego CA) containing 2% FBS,
recombinant hEGF (0.5 ng.ml-1), Gentamicin (50 μg.ml-1), and Amphotericin-B (50 ng.ml-1) (as
supplied by Clonetics), at 37°C in a 5% CO2 atmosphere. In addition, hydrocortisone, VEGF,
R3-IGF-1, ascorbic acid, hFGF-B, and heparin were included in the medium according to manufacturer's
instruction (Clonetics). The cells were grown to 85% confluency and then treated with TNF-α (10
ng.ml-1) for 1, 2, 4, 6, 8, 10, 24, and 48 hours. These TNF-α treated cells were compared to untreated
HAECs collected at 85% confluency (t = 0 hour).

For SEQ ID NO:38, the expression of a component of this polynucleotide sequence, having
Incyte clone ID 2662817, is downregulated by at least two-fold when treated with TNF-α in three
primary endothelial cell lines, HAEC, HIAEC, and HUVEC. Incyte clone ID 2662817 spans
nucleotides 474 through 1176 of Incyte polynucleotide 2457335CB1 (SEQ ID NO:38).
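
The "at least two-fold" criterion used above can be expressed as a simple comparison of treated and untreated signals. The sketch below is illustrative: the intensity values are hypothetical, the function names are assumptions, and the text does not specify any normalization that may have preceded the comparison.

```python
import math


def log2_ratio(treated, untreated):
    """Log2 of the treated-to-untreated signal ratio (negative if reduced)."""
    return math.log2(treated / untreated)


def is_downregulated(treated, untreated, fold=2.0):
    """True if the treated signal is reduced by at least the given fold."""
    return treated * fold <= untreated


# Hypothetical array element intensities (treated, untreated).
print(is_downregulated(400, 1000))   # reduced 2.5-fold
print(is_downregulated(600, 1000))   # reduced less than two-fold
print(log2_ratio(250, 1000))
```

A two-fold reduction corresponds to a log2 ratio of -1 or lower, which is a common way to report such calls.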

XII. Complementary Polynucleotides

Sequences complementary to the MDDT-encoding sequences, or any parts thereof, are used
to detect, decrease, or inhibit expression of naturally occurring MDDT. Although use of
oligonucleotides comprising from about 15 to 30 base pairs is described, essentially the same
procedure is used with smaller or with larger sequence fragments. Appropriate oligonucleotides are
designed using OLIGO 4.06 software (National Biosciences) and the coding sequence of MDDT. To
inhibit transcription, a complementary oligonucleotide is designed from the most unique 5' sequence
and used to prevent promoter binding to the coding sequence. To inhibit translation, a complementary
oligonucleotide is designed to prevent ribosomal binding to the MDDT-encoding transcript.
XIII. Expression of MDDT

Expression and purification of MDDT is achieved using bacterial or virus-based expression
systems. For expression of MDDT in bacteria, cDNA is subcloned into an appropriate vector
containing an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA
transcription. Examples of such promoters include, but are not limited to, the trp-lac (tac) hybrid
promoter and the T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory
element. Recombinant vectors are transformed into suitable bacterial hosts, e.g., BL21(DE3).
Antibiotic resistant bacteria express MDDT upon induction with isopropyl
beta-D-thiogalactopyranoside (IPTG). Expression of MDDT in eukaryotic cells is achieved by infecting
insect or mammalian cell lines with recombinant Autographa californica nuclear polyhedrosis virus
(AcMNPV), commonly known as baculovirus. The nonessential polyhedrin gene of baculovirus is
replaced with cDNA encoding MDDT by either homologous recombination or bacterial-mediated
transposition involving transfer plasmid intermediates. Viral infectivity is maintained and the strong
polyhedrin promoter drives high levels of cDNA transcription. Recombinant baculovirus is used to
infect Spodoptera frugiperda (Sf9) insect cells in most cases, or human hepatocytes, in some cases.
Infection of the latter requires additional genetic modifications to baculovirus. (See Engelhard, E.K. et
al. (1994) Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther.
7:1937-1945.)

In most expression systems, MDDT is synthesized as a fusion protein with, e.g., glutathione
S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step,
affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a 26-kilodalton
enzyme from Schistosoma japonicum, enables the purification of fusion proteins on immobilized
glutathione under conditions that maintain protein activity and antigenicity (Amersham Pharmacia
Biotech). Following purification, the GST moiety can be proteolytically cleaved from MDDT at
specifically engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity purification
using commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak).
6-His, a stretch of six consecutive histidine residues, enables purification on metal-chelate resins
(QIAGEN). Methods for protein expression and purification are discussed in Ausubel (1995, supra,
ch. 10 and 16). Purified MDDT obtained by these methods can be used directly in the assays shown
in Examples XVII, XVIII, and XIX, where applicable.

XIV. Functional Assays

MDDT function is assessed by expressing the sequences encoding MDDT at physiologically
elevated levels in mammalian cell culture systems. cDNA is subcloned into a mammalian expression
vector containing a strong promoter that drives high levels of cDNA expression. Vectors of choice
include pCMV SPORT (Life Technologies) and pCR3.1 (Invitrogen, Carlsbad CA), both of which
contain the cytomegalovirus promoter. 5-10 μg of recombinant vector are transiently transfected into
a human cell line, for example, an endothelial or hematopoietic cell line, using either liposome
formulations or electroporation. 1-2 μg of an additional plasmid containing sequences encoding a
marker protein are co-transfected. Expression of a marker protein provides a means to distinguish
transfected cells from nontransfected cells and is a reliable predictor of cDNA expression from the
recombinant vector. Marker proteins of choice include, e.g., Green Fluorescent Protein (GFP;
Clontech), CD64, or a CD64-GFP fusion protein. Flow cytometry (FCM), an automated, laser
optics-based technique, is used to identify transfected cells expressing GFP or CD64-GFP and to evaluate
the apoptotic state of the cells and other cellular properties. FCM detects and quantifies the uptake of
fluorescent molecules that diagnose events preceding or coincident with cell death. These events
include changes in nuclear DNA content as measured by staining of DNA with propidium iodide;
changes in cell size and granularity as measured by forward light scatter and 90 degree side light
scatter; down-regulation of DNA synthesis as measured by decrease in bromodeoxyuridine uptake;
alterations in expression of cell surface and intracellular proteins as measured by reactivity with
specific antibodies; and alterations in plasma membrane composition as measured by the binding of
fluorescein-conjugated Annexin V protein to the cell surface. Methods in flow cytometry are
discussed in Ormerod, M.G. (1994) Flow Cytometry, Oxford, New York NY.

The influence of MDDT on gene expression can be assessed using highly purified populations
of cells transfected with sequences encoding MDDT and either CD64 or CD64-GFP. CD64 and
CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human
immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using
magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Lake Success
NY). mRNA can be purified from the cells using methods well known by those of skill in the art.
Expression of mRNA encoding MDDT and other genes of interest can be analyzed by northern
analysis or microarray techniques.

XV. Production of MDDT Specific Antibodies

MDDT substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g.,
Harrington, M.G. (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to
immunize animals (e.g., rabbits, mice, etc.) and to produce antibodies using standard protocols.

Alternatively, the MDDT amino acid sequence is analyzed using LASERGENE software
(DNASTAR) to determine regions of high immunogenicity, and a corresponding oligopeptide is
synthesized and used to raise antibodies by means known to those of skill in the art. Methods for
selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well
described in the art. (See, e.g., Ausubel, 1995, supra, ch. 11.)

Typically, oligopeptides of about 15 residues in length are synthesized using an ABI 431A
peptide synthesizer (Applied Biosystems) using FMOC chemistry and coupled to KLH
(Sigma-Aldrich, St. Louis MO) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to
increase immunogenicity. (See, e.g., Ausubel, 1995, supra.) Rabbits are immunized with the
oligopeptide-KLH complex in complete Freund's adjuvant. Resulting antisera are tested for
antipeptide and anti-MDDT activity by, for example, binding the peptide or MDDT to a substrate,
blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with radio-iodinated goat
anti-rabbit IgG.

XVI. Purification of Naturally Occurring MDDT Using Specific Antibodies 

Naturally occurring or recombinant MDDT is substantially purified by immunoaffinity
chromatography using antibodies specific for MDDT. An immunoaffinity column is constructed by
covalently coupling anti-MDDT antibody to an activated chromatographic resin, such as
CNBr-activated SEPHAROSE (Amersham Pharmacia Biotech). After the coupling, the resin is
blocked and washed according to the manufacturer's instructions.

Media containing MDDT are passed over the immunoaffinity column, and the column is
washed under conditions that allow the preferential absorbance of MDDT (e.g., high ionic strength
buffers in the presence of detergent). The column is eluted under conditions that disrupt
antibody/MDDT binding (e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such
as urea or thiocyanate ion), and MDDT is collected.

XVII. Identification of Molecules Which Interact with MDDT

MDDT, or biologically active fragments thereof, are labeled with 125I Bolton-Hunter reagent.
(See, e.g., Bolton, A.E. and W.M. Hunter (1973) Biochem. J. 133:529-539.) Candidate molecules
previously arrayed in the wells of a multi-well plate are incubated with the labeled MDDT, washed,
and any wells with labeled MDDT complex are assayed. Data obtained using different concentrations
of MDDT are used to calculate values for the number, affinity, and association of MDDT with the
candidate molecules.

Alternatively, molecules interacting with MDDT are analyzed using the yeast two-hybrid
system as described in Fields, S. and O. Song (1989) Nature 340:245-246, or using commercially
available kits based on the two-hybrid system, such as the MATCHMAKER system (Clontech).
MDDT may also be used in the PATHCALLING process (CuraGen Corp., New Haven CT)
which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions
between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S.
Patent No. 6,057,101).
XVIII. Demonstration of MDDT Activity

A microtubule motility assay for MDDT measures motor protein activity. In this assay,
recombinant MDDT is immobilized onto a glass slide or similar substrate. Taxol-stabilized bovine
brain microtubules (commercially available) in a solution containing ATP and cytosolic extract are
perfused onto the slide. Movement of microtubules as driven by MDDT motor activity can be
visualized and quantified using video-enhanced light microscopy and image analysis techniques.
MDDT activity is directly proportional to the frequency and velocity of microtubule movement.
Alternatively, an assay for MDDT activity measures the formation of protein filaments in
vitro. A solution of MDDT at a concentration greater than the "critical concentration" for polymer
assembly is applied to carbon-coated grids. Appropriate nucleation sites may be supplied in the
solution. The grids are negatively stained with 0.7% (w/v) aqueous uranyl acetate and examined by
electron microscopy. The appearance of filaments of approximately 25 nm (microtubules), 8 nm
(actin), or 10 nm (intermediate filaments) is a demonstration of MDDT activity.

In another alternative, MDDT activity is measured by the binding of MDDT to protein
filaments. 35S-Met labeled MDDT sample is incubated with the appropriate filament protein (actin,
tubulin, or intermediate filament protein) and complexed protein is collected by immunoprecipitation
using an antibody against the filament protein. The immunoprecipitate is then run out on SDS-PAGE
and the amount of MDDT bound is measured by autoradiography.

MDDT activity is measured by its ability to stimulate transcription of a reporter gene (Liu,
H.Y. et al. (1997) EMBO J. 16:5289-5298). The assay entails the use of a well characterized
reporter gene construct, LexAop-LacZ, that consists of LexA DNA transcriptional control elements
(LexAop) fused to sequences encoding the E. coli LacZ enzyme. The methods for constructing and
expressing fusion genes, introducing them into cells, and measuring LacZ enzyme activity, are well
known to those skilled in the art. Sequences encoding MDDT are cloned into a plasmid that directs
the synthesis of a fusion protein, LexA-MDDT, consisting of MDDT and a DNA binding domain
derived from the LexA transcription factor. The resulting plasmid, encoding a LexA-MDDT fusion
protein, is introduced into yeast cells along with a plasmid containing the LexAop-LacZ reporter gene.
The amount of LacZ enzyme activity associated with LexA-MDDT transfected cells, relative to
control cells, is proportional to the amount of transcription stimulated by the MDDT.

Alternatively, MDDT activity is measured by its ability to bind zinc. A 5-10 mM sample
solution in 2.5 mM ammonium acetate solution at pH 7.4 is combined with 0.05 M zinc sulfate solution
(Aldrich, Milwaukee WI) in the presence of 100 mM dithiothreitol with 10% methanol added. The
sample and zinc sulfate solutions are allowed to incubate for 20 minutes. The reaction solution is
passed through a VYDAC column (Grace Vydac, Hesperia, CA) with approximately 300 Angstrom
bore size and 5 μm particle size to isolate zinc-sample complex from the solution, and into a mass
spectrometer (PE Sciex, Ontario, Canada). Zinc bound to sample is quantified using the functional
atomic mass of 63.5 Da observed by Whittal, R. M. et al. ((2000) Biochemistry 39:8406-8417).

In the alternative, a method to determine nucleic acid binding activity of MDDT involves a
polyacrylamide gel mobility-shift assay. In preparation for this assay, MDDT is expressed by
transforming a mammalian cell line such as COS7, HeLa, or CHO with a eukaryotic expression vector
containing MDDT cDNA. The cells are incubated for 48-72 hours after transformation under
conditions appropriate for the cell line to allow expression and accumulation of MDDT. Extracts
containing solubilized proteins can be prepared from cells expressing MDDT by methods well known
in the art. Portions of the extract containing MDDT are added to [32P]-labeled RNA or DNA.
Radioactive nucleic acid can be synthesized in vitro by techniques well known in the art. The mixtures
are incubated at 25°C in the presence of RNase- and DNase-inhibitors under buffered conditions for
5-10 minutes. After incubation, the samples are analyzed by polyacrylamide gel electrophoresis
followed by autoradiography. The presence of a band on the autoradiogram indicates the formation of
a complex between MDDT and the radioactive transcript. A band of similar mobility will not be
present in samples prepared using control extracts prepared from untransformed cells.

In the alternative, a method to determine methylase activity of MDDT measures transfer of
radiolabeled methyl groups between a donor substrate and an acceptor substrate. Reaction mixtures
(50 μl final volume) contain 15 mM HEPES, pH 7.9, 1.5 mM MgCl2, 10 mM dithiothreitol, 3%
polyvinylalcohol, 1.5 μCi [methyl-3H]AdoMet (0.375 μM AdoMet) (DuPont-NEN), 0.6 μg MDDT,
and acceptor substrate (e.g., 0.4 μg [35S]RNA, or 6-mercaptopurine (6-MP) to 1 mM final
concentration). Reaction mixtures are incubated at 30°C for 30 minutes, then 65°C for 5 minutes.

Analysis of [methyl-3H]RNA is as follows: (1) 50 μl of 2 x loading buffer (20 mM Tris-HCl,
pH 7.6, 1 M LiCl, 1 mM EDTA, 1% sodium dodecyl sulphate (SDS)) and 50 μl oligo d(T)-cellulose
(10 mg/ml in 1 x loading buffer) are added to the reaction mixture, and incubated at ambient
temperature with shaking for 30 minutes. (2) Reaction mixtures are transferred to a 96-well filtration
plate attached to a vacuum apparatus. (3) Each sample is washed sequentially with three 2.4 ml
aliquots of 1 x oligo d(T) loading buffer containing 0.5% SDS, 0.1% SDS, or no SDS. (4) RNA is
eluted with 300 μl of water into a 96-well collection plate, transferred to scintillation vials containing
liquid scintillant, and radioactivity determined.

Analysis of [methyl-3H]6-MP is as follows: (1) 500 μl 0.5 M borate buffer, pH 10.0, and then
2.5 ml of 20% (v/v) isoamyl alcohol in toluene are added to the reaction mixtures. (2) The samples
are mixed by vigorous vortexing for ten seconds. (3) After centrifugation at 700g for 10 minutes, 1.5
ml of the organic phase is transferred to scintillation vials containing 0.5 ml absolute ethanol and liquid
scintillant, and radioactivity determined. (4) Results are corrected for the extraction of 6-MP into the
organic phase (approximately 41%).

In the alternative, type I topoisomerase activity of MDDT can be assayed based on the
relaxation of a supercoiled DNA substrate. MDDT is incubated with its substrate in a buffer lacking
Mg2+ and ATP, the reaction is terminated, and the products are loaded on an agarose gel. Altered
topoisomers can be distinguished from supercoiled substrate electrophoretically. This assay is specific
for type I topoisomerase activity because Mg2+ and ATP are necessary cofactors for type II
topoisomerases.

Type II topoisomerase activity of MDDT can be assayed based on the decatenation of a
kinetoplast DNA (KDNA) substrate. MDDT is incubated with KDNA, the reaction is terminated,
and the products are loaded on an agarose gel. Monomeric circular KDNA can be distinguished from
catenated KDNA electrophoretically. Kits for measuring type I and type II topoisomerase activities
are available commercially from Topogen (Columbus OH).

ATP-dependent RNA helicase unwinding activity of MDDT can be measured by the method
described by Zhang and Grosse (1994; Biochemistry 33:3906-3912). The substrate for RNA
unwinding consists of 32P-labeled RNA composed of two RNA strands of 194 and 130 nucleotides in
length containing a duplex region of 17 base-pairs. The RNA substrate is incubated together with
ATP, Mg2+, and varying amounts of MDDT in a Tris-HCl buffer, pH 7.5, at 37°C for 30 minutes.
The single-stranded RNA product is then separated from the double-stranded RNA substrate by
electrophoresis through a 10% SDS-polyacrylamide gel, and quantitated by autoradiography. The
amount of single-stranded RNA recovered is proportional to the amount of MDDT in the preparation.

In the alternative, MDDT function is assessed by expressing the sequences encoding MDDT
at physiologically elevated levels in mammalian cell culture systems. cDNA is subcloned into a
mammalian expression vector containing a strong promoter that drives high levels of cDNA
expression. Vectors of choice include pCMV SPORT (Life Technologies) and pCR3.1 (Invitrogen
Corporation, Carlsbad CA), both of which contain the cytomegalovirus promoter. 5-10 μg of
recombinant vector are transiently transfected into a human cell line, preferably of endothelial or
hematopoietic origin, using either liposome formulations or electroporation. 1-2 μg of an additional
plasmid containing sequences encoding a marker protein are co-transfected.

Expression of a marker protein provides a means to distinguish transfected cells from
nontransfected cells and is a reliable predictor of cDNA expression from the recombinant vector.
Marker proteins of choice include, e.g., Green Fluorescent Protein (GFP; CLONTECH), CD64, or a
CD64-GFP fusion protein. Flow cytometry (FCM), an automated laser optics-based technique, is
used to identify transfected cells expressing GFP or CD64-GFP and to evaluate the apoptotic state of
the cells and other cellular properties.

FCM detects and quantifies the uptake of fluorescent molecules that diagnose events
preceding or coincident with cell death. These events include changes in nuclear DNA content as
measured by staining of DNA with propidium iodide; changes in cell size and granularity as measured
by forward light scatter and 90 degree side light scatter; down-regulation of DNA synthesis as
measured by decrease in bromodeoxyuridine uptake; alterations in expression of cell surface and
intracellular proteins as measured by reactivity with specific antibodies; and alterations in plasma
membrane composition as measured by the binding of fluorescein-conjugated Annexin V protein to the
cell surface. Methods in flow cytometry are discussed in Ormerod, M. G. (1994) Flow Cytometry,
Oxford, New York NY.

The influence of MDDT on gene expression can be assessed using highly purified populations
of cells transfected with sequences encoding MDDT and either CD64 or CD64-GFP. CD64 and
CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human
immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using
magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Inc., Lake
Success NY). mRNA can be purified from the cells using methods well known by those of skill in the
art. Expression of mRNA encoding MDDT and other genes of interest can be analyzed by northern
analysis or microarray techniques.

Pseudouridine synthase activity of MDDT is assayed using a tritium (³H) release assay
modified from Nurse et al. ((1995) RNA 1:102-112), which measures the release of ³H from the C5
position of the pyrimidine component of uridylate (U) when ³H-radiolabeled U in RNA is isomerized to
pseudouridine (ψ). A typical 500 µl assay mixture contains 50 mM HEPES buffer (pH 7.5), 100 mM
ammonium acetate, 5 mM dithiothreitol, 1 mM EDTA, 30 units RNase inhibitor, and 0.1-4.2 µM
[5-³H]tRNA (approximately 1 µCi/nmol tRNA). The reaction is initiated by the addition of <5 µl of a
concentrated solution of MDDT (or sample containing MDDT) and incubated for 5 min at 37 °C.
Portions of the reaction mixture are removed at various times (up to 30 min) following the addition of
MDDT and quenched by dilution into 1 ml 0.1 M HCl containing Norit-SA3 (12% w/v). The
quenched reaction mixtures are centrifuged for 5 min at maximum speed in a microcentrifuge, and the
supernatants are filtered through a plug of glass wool. The pellet is washed twice by resuspension in 1
ml 0.1 M HCl, followed by centrifugation. The supernatants from the washes are separately passed
through the glass wool plug and combined with the original filtrate. A portion of the combined filtrate
is mixed with scintillation fluid (up to 10 ml) and counted using a scintillation counter. The amount of
³H released from the RNA and present in the soluble filtrate is proportional to the amount of
pseudouridine synthase activity in the sample (Ramamurthy, V. (1999) J. Biol. Chem.
274:22225-22230).
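
As an illustration of how the scintillation counts from this assay might be converted into an amount of released ³H, the following Python sketch assumes a substrate specific activity of about 1 µCi/nmol, as stated above. The function name, the blank-subtraction step, and the `fraction_counted` parameter are assumptions introduced for the example; they are not part of the cited protocol.

```python
# Hypothetical helper: converts scintillation counts to pmol of 3H released.
# Assumes counts are already expressed as disintegrations per minute (dpm).

DPM_PER_MICROCURIE = 2.22e6  # 1 uCi = 3.7e4 Bq = 2.22e6 disintegrations/min


def tritium_released_pmol(sample_dpm, blank_dpm, fraction_counted,
                          specific_activity_uci_per_nmol=1.0):
    """Estimate pmol of 3H released into the soluble filtrate.

    sample_dpm: dpm measured for the counted portion of the filtrate
    blank_dpm: dpm of a no-enzyme control treated the same way (assumption)
    fraction_counted: portion of the combined filtrate that was counted (0-1]
    specific_activity_uci_per_nmol: substrate specific activity (about 1 in the text)
    """
    if not 0 < fraction_counted <= 1:
        raise ValueError("fraction_counted must be in (0, 1]")
    if specific_activity_uci_per_nmol <= 0:
        raise ValueError("specific activity must be positive")
    net_dpm = max(sample_dpm - blank_dpm, 0.0)   # remove background counts
    total_dpm = net_dpm / fraction_counted       # scale to the whole filtrate
    microcuries = total_dpm / DPM_PER_MICROCURIE
    nmol = microcuries / specific_activity_uci_per_nmol
    return nmol * 1000.0                         # nmol -> pmol


if __name__ == "__main__":
    print(tritium_released_pmol(22300.0, 100.0, 0.5))
```

Plotting the returned values against the sampling time (up to 30 min) would give a release curve whose initial slope reflects relative enzyme activity.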

In the alternative, pseudouridine synthase activity of MDDT is assayed at 30 to 37 °C in a
mixture containing 100 mM Tris-HCl (pH 8.0), 100 mM ammonium acetate, 5 mM MgCl2, 2 mM
dithiothreitol, 0.1 mM EDTA, and 1-2 fmol of [³²P]-radiolabeled runoff transcripts (generated in vitro
by an appropriate RNA polymerase, i.e., T7 or SP6) as substrates. MDDT is added to initiate the
reaction or omitted from the reaction in control samples. Following incubation, the RNA is extracted
with phenol-chloroform, precipitated in ethanol, and hydrolyzed completely to 3'-nucleotide
monophosphates using RNase T2. The hydrolysates are analyzed by two-dimensional thin layer
chromatography, and the amounts of ³²P radiolabel present in the ψMP and UMP spots are evaluated
after exposing the thin layer chromatography plates to film or a PhosphorImager screen. Taking into
account the relative number of uridylate residues in the substrate RNA, the relative amounts of ψMP and
UMP are determined and used to calculate the relative amount of ψ per tRNA molecule (expressed in
mol ψ/mol of tRNA or mol ψ/mol of tRNA/minute), which corresponds to the amount of
pseudouridine synthase activity in the MDDT sample (Lecointe, F. et al. (1998) J. Biol. Chem.
273:1316-1323).
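
The calculation described above can be sketched in Python as follows. This is a minimal illustration, assuming that the ψMP and UMP spot signals have already been background-corrected and that `labeled_u_residues` (a hypothetical parameter) is the number of uridylate positions per substrate molecule that carry ³²P after hydrolysis; neither the function names nor the normalization are taken from the cited reference.

```python
# Hypothetical helpers for the TLC-based pseudouridine assay.

def psi_per_trna(psi_mp_signal, ump_signal, labeled_u_residues):
    """Return mol psi per mol of substrate RNA.

    psi_mp_signal, ump_signal: background-corrected spot intensities
    labeled_u_residues: labeled uridylate positions per RNA molecule (assumption)
    """
    total = psi_mp_signal + ump_signal
    if total <= 0:
        raise ValueError("no signal in the psi-MP and UMP spots")
    fraction_converted = psi_mp_signal / total
    return labeled_u_residues * fraction_converted


def psi_per_trna_per_minute(psi_mp_signal, ump_signal, labeled_u_residues,
                            minutes):
    """Return mol psi per mol RNA per minute of incubation."""
    if minutes <= 0:
        raise ValueError("incubation time must be positive")
    return psi_per_trna(psi_mp_signal, ump_signal, labeled_u_residues) / minutes


if __name__ == "__main__":
    # Example with made-up intensities: a quarter of 4 labeled U residues converted.
    print(psi_per_trna(250.0, 750.0, 4))
```

Comparing the value obtained with MDDT against the control reaction lacking MDDT indicates how much of the conversion is attributable to the sample.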

N²,N²-dimethylguanosine transferase ((m²₂G)methyltransferase) activity of MDDT is
measured in a 160 µl reaction mixture containing 100 mM Tris-HCl (pH 7.5), 0.1 mM EDTA, 10 mM
MgCl2, 20 mM NH4Cl, 1 mM dithiothreitol, 6.2 µM S-adenosyl-L-[methyl-³H]methionine (30-70
Ci/mM), 8 µg m²₂G-deficient tRNA or wild type tRNA from yeast, and approximately 100 µg of
purified MDDT or a sample comprising MDDT. The reactions are incubated at 30 °C for 90 min and
chilled on ice. A portion of each reaction is diluted to 1 ml in water containing 100 µg BSA. 1 ml of 2
M HCl is added to each sample and the acid insoluble products are allowed to precipitate on ice for 20
min before being collected by filtration through glass fiber filters. The collected material is washed
several times with HCl and quantitated using a liquid scintillation counter. The amount of ³H
incorporated into the m²₂G-deficient, acid-insoluble tRNAs is proportional to the amount of
N²,N²-dimethylguanosine transferase activity in the MDDT sample. Reactions comprising no
substrate tRNAs, or wild-type tRNAs that have already been modified, serve as control reactions
which should not yield acid-insoluble ³H-labeled products.

Polyadenylation activity of MDDT is measured using an in vitro polyadenylation reaction. The
reaction mixture is assembled on ice and comprises 10 µl of 5 mM dithiothreitol, 0.025% (v/v)
NONIDET P-40, 50 mM creatine phosphate, 6.5% (w/v) polyvinyl alcohol, 0.5 unit/µl RNAGUARD
(Pharmacia), 0.025 µg/µl creatine kinase, 1.25 mM cordycepin 5'-triphosphate, and 3.75 mM MgCl2, in
a total volume of 25 µl. 60 fmol of CstF, 50 fmol of CPSF, 240 fmol of PAP, 4 µl of crude or partially
purified CF II, and various amounts of CF I are then added to the reaction mix. The volume
is adjusted to 23.5 µl with a buffer containing 50 mM Tris-HCl, pH 7.9, 10% (v/v) glycerol, and 0.1 mM
Na-EDTA. The final ammonium sulfate concentration should be below 20 mM. The reaction is
initiated (on ice) by the addition of 15 fmol of ³²P-labeled pre-mRNA template, along with 2.5 µg of
unlabeled tRNA, in 1.5 µl of water. Reactions are then incubated at 30 °C for 75-90 min and stopped
by the addition of 75 µl (approximately two volumes) of proteinase K mix (0.2 M Tris-HCl, pH 7.9,
300 mM NaCl, 25 mM Na-EDTA, 2% (w/v) SDS, 1 µl of 10 mg/ml proteinase K, 0.25 µl of 20 mg/ml
glycogen, and 23.75 µl of water). Following incubation, the RNA is precipitated with ethanol and
analyzed on a 6% (w/v) polyacrylamide, 8.3 M urea sequencing gel. The dried gel is developed by
autoradiography or using a phosphoimager. Cleavage activity is determined by comparing the amount
of cleavage product to the amount of pre-mRNA template. The omission of any of the polypeptide
components of the reaction and substitution of MDDT is useful for identifying the specific biological
function of MDDT in pre-mRNA polyadenylation (Rüegsegger, U. et al. (1996) J. Biol. Chem.
271:6107-6113; and references within).

tRNA synthetase activity is measured as the aminoacylation of a substrate tRNA in the
presence of [¹⁴C]-labeled amino acid. MDDT is incubated with [¹⁴C]-labeled amino acid and the
appropriate cognate tRNA (for example, [¹⁴C]alanine and tRNA(Ala)) in a buffered solution. ¹⁴C-labeled
product is separated from free [¹⁴C]amino acid by chromatography, and the incorporated ¹⁴C is
quantified by scintillation counter. The amount of ¹⁴C-labeled product detected is proportional to the
activity of MDDT in this assay.

In the alternative, MDDT activity is measured by incubating a sample containing MDDT in a
solution containing 1 mM ATP, 5 mM Hepes-KOH (pH 7.0), 2.5 mM KCl, 1.5 mM magnesium
chloride, and 0.5 mM DTT along with misacylated [¹⁴C]-Glu-tRNA(Gln) (e.g., 1 µM) and a similar
concentration of unlabeled L-glutamine. Following the quenching of the reaction with 3 M sodium
acetate (pH 5.0), the mixture is extracted with an equal volume of water-saturated phenol, and the
aqueous and organic phases are separated by centrifugation at 15,000 x g at room temperature for 1
min. The aqueous phase is removed and precipitated with 3 volumes of ethanol at -70 °C for 15 min.
The precipitated aminoacyl-tRNAs are recovered by centrifugation at 15,000 x g at 4 °C for 15 min.
The pellet is resuspended in 25 mM KOH, deacylated at 65 °C for 10 min, neutralized with 0.1 M
HCl (to final pH 6-7), and dried under vacuum. The dried pellet is resuspended in water and spotted
onto a cellulose TLC plate. The plate is developed in either isopropanol/formic acid/water or
ammonia/water/chloroform/methanol. The image is subjected to densitometric analysis and the
relative amounts of Glu and Gln are calculated based on the Rf values and relative intensities of the
spots. MDDT activity is calculated based on the amount of Gln resulting from the transformation of
Glu while acylated as Glu-tRNA(Gln) (adapted from Curnow, A.W. et al. (1997) Proc. Natl. Acad. Sci.
94:11819-26).
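
The densitometric calculation in the preceding assay might be carried out as in the following Python sketch. It assumes that the Glu and Gln spots have been identified by their Rf values and that their intensities are background-corrected; the reaction volume is not stated in the text, so `volume_ul` is a hypothetical parameter supplied by the user.

```python
# Hypothetical helper for the Glu -> Gln conversion assay read-out.

def gln_formed(glu_intensity, gln_intensity, substrate_um=1.0, volume_ul=20.0):
    """Return (fraction converted, pmol Gln formed).

    glu_intensity, gln_intensity: background-corrected spot intensities
    substrate_um: starting Glu-tRNA(Gln) concentration in uM (1 uM in the text)
    volume_ul: reaction volume in ul (assumption; not given in the text)
    """
    total = glu_intensity + gln_intensity
    if total <= 0:
        raise ValueError("no signal in the Glu and Gln spots")
    fraction = gln_intensity / total
    substrate_pmol = substrate_um * volume_ul  # uM x ul = pmol
    return fraction, fraction * substrate_pmol


if __name__ == "__main__":
    # Example with made-up spot intensities.
    print(gln_formed(300.0, 100.0, substrate_um=1.0, volume_ul=20.0))
```

Dividing the pmol of Gln formed by the incubation time and the amount of sample would express the result as a specific activity, if those quantities are recorded.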

XIX. Identification of MDDT Agonists and Antagonists

Agonists or antagonists of MDDT activation or inhibition may be tested using the assays
described in section XVII. Agonists cause an increase in MDDT activity and antagonists cause a
decrease in MDDT activity.
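
A minimal Python sketch of this comparison is given below. The 10% tolerance used to decide whether a change is meaningful is an assumption chosen for illustration, as are the function and label names; in practice the threshold would depend on the variability of the particular assay.

```python
# Hypothetical classifier: compares MDDT activity with and without a test compound.

def classify_compound(activity_with, activity_without, tolerance=0.10):
    """Label a compound as 'agonist', 'antagonist', or 'no effect'.

    activity_with: MDDT activity measured in the presence of the compound
    activity_without: MDDT activity measured in its absence (must be > 0)
    tolerance: relative change treated as assay noise (assumption)
    """
    if activity_without <= 0:
        raise ValueError("baseline activity must be positive")
    relative_change = (activity_with - activity_without) / activity_without
    if relative_change > tolerance:
        return "agonist"       # activity increased
    if relative_change < -tolerance:
        return "antagonist"    # activity decreased
    return "no effect"


if __name__ == "__main__":
    print(classify_compound(150.0, 100.0))
    print(classify_compound(50.0, 100.0))
```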

Various modifications and variations of the described methods and systems of the invention
will be apparent to those skilled in the art without departing from the scope and spirit of the invention.
Although the invention has been described in connection with certain embodiments, it should be
understood that the invention as claimed should not be unduly limited to such specific embodiments.
Indeed, various modifications of the described modes for carrying out the invention which are obvious
to those skilled in molecular biology or related fields are intended to be within the scope of the
following claims.

transcription factor PD002421: L298-F462

Table 5

Polynucleotide SEQ ID NO:    Incyte Project ID:    Representative Library
24                           71230017CB1           LUNGNOT35
25                           3125036CB1            LIVRNON08
26                           1758089CB1            BRAITDR03
27                           3533891CB1            HELATXT05
28                           1510943CB1            OVARTUE01
29                           2119377CB1            PANCNOT05
30                           3176058CB1            ADRENON04
31                           2299818CB1            BRABDIR01
32                           2729451CB1            PROSNON01
33                           878534CB1             PITUNOT03
34                           2806157CB1            BLADTUT08
35                           5883626CB1            LIVRNON08
36                           2674016CB1            BEPINOT01
37                           5994159CB1            SKINNOT05
38                           2457335CB1            ENDANOT01
39                           2267802CB1            EPIPNOT01
40                           3212060CB1            THYMNOT08
41                           3121069CB1            COLNTUT02
42                           3280626CB1            STOMFET02
43                           484404CB1             PROSTUT09
44                           2830063CB1            TLYMNOT03
45                           7506096CB1            TLYMNOT05
46                           7505914CB1            TLYNrrXT02

What is claimed is:

1. An isolated polypeptide selected from the group consisting of:

a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-23,

b) a polypeptide comprising a naturally occurring amino acid sequence at least 90%
identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:1-14, SEQ ID NO:17, and SEQ ID NO:19-23,

c) a naturally occurring polypeptide comprising an amino acid sequence at least 99%
identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:15-16 and SEQ ID NO:18,

d) a biologically active fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-23, and

e) an immunogenic fragment of a polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NO:1-23.

2. An isolated polypeptide of claim 1 comprising an amino acid sequence selected from the
group consisting of SEQ ID NO:1-23.

3. An isolated polynucleotide encoding a polypeptide of claim 1.

4. An isolated polynucleotide encoding a polypeptide of claim 2.

5. An isolated polynucleotide of claim 4 comprising a polynucleotide sequence selected from 
the group consisting of SEQ ID NO:24-46. 

6. A recombinant polynucleotide comprising a promoter sequence operably linked to a 
polynucleotide of claim 3. 

7. A cell transformed with a recombinant polynucleotide of claim 6. 

8. A transgenic organism comprising a recombinant polynucleotide of claim 6. 

9. A method of producing a polypeptide of claim 1, the method comprising:

a) culturing a cell under conditions suitable for expression of the polypeptide, wherein
said cell is transformed with a recombinant polynucleotide, and said recombinant
polynucleotide comprises a promoter sequence operably linked to a polynucleotide
encoding the polypeptide of claim 1, and

b) recovering the polypeptide so expressed.

10. A method of claim 9, wherein the polypeptide comprises an amino acid sequence selected
from the group consisting of SEQ ID NO:1-23.

11. An isolated antibody which specifically binds to a polypeptide of claim 1.

12. An isolated polynucleotide selected from the group consisting of:

a) a polynucleotide comprising a polynucleotide sequence selected from the group
consisting of SEQ ID NO:24-46,

b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
90% identical to a polynucleotide sequence selected from the group consisting of SEQ
ID NO:24-46,

c) a polynucleotide complementary to a polynucleotide of a),

d) a polynucleotide complementary to a polynucleotide of b), and

e) an RNA equivalent of a)-d).

13. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a 
polynucleotide of claim 12. 

14. A method of detecting a target polynucleotide in a sample, said target polynucleotide
having a sequence of a polynucleotide of claim 12, the method comprising:

a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides
comprising a sequence complementary to said target polynucleotide in the sample, and
which probe specifically hybridizes to said target polynucleotide, under conditions
whereby a hybridization complex is formed between said probe and said target
polynucleotide or fragments thereof, and

b) detecting the presence or absence of said hybridization complex, and, optionally, if
present, the amount thereof.

WO 02/078420    PCT/US02/09809

15. A method of claim 14, wherein the probe comprises at least 60 contiguous nucleotides.

16. A method of detecting a target polynucleotide in a sample, said target polynucleotide
having a sequence of a polynucleotide of claim 12, the method comprising:

a) amplifying said target polynucleotide or fragment thereof using polymerase chain
reaction amplification, and

b) detecting the presence or absence of said amplified target polynucleotide or fragment
thereof, and, optionally, if present, the amount thereof.

17. A composition comprising a polypeptide of claim 1 and a pharmaceutically acceptable
excipient.

18. A composition of claim 17, wherein the polypeptide comprises an amino acid sequence
selected from the group consisting of SEQ ID NO:1-23.

19. A method for treating a disease or condition associated with decreased expression of
functional MDDT, comprising administering to a patient in need of such treatment the composition of
claim 17.

20. A method of screening a compound for effectiveness as an agonist of a polypeptide of
claim 1, the method comprising:

a) exposing a sample comprising a polypeptide of claim 1 to a compound, and

b) detecting agonist activity in the sample.

21. A composition comprising an agonist compound identified by a method of claim 20 and a
pharmaceutically acceptable excipient.

22. A method for treating a disease or condition associated with decreased expression of
functional MDDT, comprising administering to a patient in need of such treatment a composition of
claim 21.

23. A method of screening a compound for effectiveness as an antagonist of a polypeptide of
claim 1, the method comprising:

a) exposing a sample comprising a polypeptide of claim 1 to a compound, and

b) detecting antagonist activity in the sample.

24. A composition comprising an antagonist compound identified by a method of claim 23 and
a pharmaceutically acceptable excipient.

25. A method for treating a disease or condition associated with overexpression of functional
MDDT, comprising administering to a patient in need of such treatment a composition of claim 24.

26. A method of screening for a compound that specifically binds to the polypeptide of claim
1, the method comprising:

a) combining the polypeptide of claim 1 with at least one test compound under suitable
conditions, and

b) detecting binding of the polypeptide of claim 1 to the test compound, thereby
identifying a compound that specifically binds to the polypeptide of claim 1.

27. A method of screening for a compound that modulates the activity of the polypeptide of
claim 1, the method comprising:

a) combining the polypeptide of claim 1 with at least one test compound under conditions
permissive for the activity of the polypeptide of claim 1,

b) assessing the activity of the polypeptide of claim 1 in the presence of the test
compound, and

c) comparing the activity of the polypeptide of claim 1 in the presence of the test
compound with the activity of the polypeptide of claim 1 in the absence of the test
compound, wherein a change in the activity of the polypeptide of claim 1 in the
presence of the test compound is indicative of a compound that modulates the activity
of the polypeptide of claim 1.

28. A method of screening a compound for effectiveness in altering expression of a target
polynucleotide, wherein said target polynucleotide comprises a sequence of claim 5, the method
comprising:

a) exposing a sample comprising the target polynucleotide to a compound, under
conditions suitable for the expression of the target polynucleotide,

b) detecting altered expression of the target polynucleotide, and

c) comparing the expression of the target polynucleotide in the presence of varying
amounts of the compound and in the absence of the compound.

29. A method of assessing toxicity of a test compound, the method comprising: 

a) treating a biological sample containing nucleic acids with the test compound, 

b) hybridizing the nucleic acids of the treated biological sample with a probe comprising 
at least 20 contiguous nucleotides of a polynucleotide of claim 12 under conditions 
whereby a specific hybridization complex is formed between said probe and a target 
polynucleotide in the biological sample, said target polynucleotide comprising a 
polynucleotide sequence of a polynucleotide of claim 12 or fragment thereof, 

c) quantifying the amount of hybridization complex, and

d) comparing the amount of hybridization complex in the treated biological sample with 
the amount of hybridization complex in an untreated biological sample, wherein a 
difference in the amount of hybridization complex in the treated biological sample is 
indicative of toxicity of the test compound. 

30. A diagnostic test for a condition or disease associated with the expression of MDDT in a 
biological sample, the method comprising: 

a) combining the biological sample with an antibody of claim 11, under conditions suitable
for the antibody to bind the polypeptide and form an antibody:polypeptide complex,
and

b) detecting the complex, wherein the presence of the complex correlates with the
presence of the polypeptide in the biological sample.

31. The antibody of claim 11, wherein the antibody is:

a) a chimeric antibody,

b) a single chain antibody,

c) a Fab fragment, 

d) a F(ab')2 fragment, or 

e) a humanized antibody. 

32. A composition comprising an antibody of claim 11 and an acceptable excipient.

33. A method of diagnosing a condition or disease associated with the expression of MDDT 
in a subject, comprising administering to said subject an effective amount of the composition of claim 
32. 

34. A composition of claim 32, wherein the antibody is labeled. 

35. A method of diagnosing a condition or disease associated with the expression of MDDT 
in a subject, comprising administering to said subject an effective amount of the composition of claim 
34. 

36. A method of preparing a polyclonal antibody with the specificity of the antibody of claim
11, the method comprising:

a) immunizing an animal with a polypeptide consisting of an amino acid sequence
selected from the group consisting of SEQ ID NO:1-23, or an immunogenic fragment
thereof, under conditions to elicit an antibody response,

b) isolating antibodies from said animal, and

c) screening the isolated antibodies with the polypeptide, thereby identifying a polyclonal
antibody which specifically binds to a polypeptide comprising an amino acid sequence
selected from the group consisting of SEQ ID NO:1-23.

37. A polyclonal antibody produced by a method of claim 36. 

38. A composition comprising the polyclonal antibody of claim 37 and a suitable carrier. 

39. A method of making a monoclonal antibody with the specificity of the antibody of claim
11, the method comprising:

a) immunizing an animal with a polypeptide consisting of an amino acid sequence
selected from the group consisting of SEQ ID NO:1-23, or an immunogenic fragment
thereof, under conditions to elicit an antibody response,

b) isolating antibody producing cells from the animal,

c) fusing the antibody producing cells with immortalized cells to form monoclonal
antibody-producing hybridoma cells,

d) culturing the hybridoma cells, and

e) isolating from the culture monoclonal antibody which specifically binds to a
polypeptide comprising an amino acid sequence selected from the group consisting of
SEQ ID NO:1-23.

40. A monoclonal antibody produced by a method of claim 39. 

41. A composition comprising the monoclonal antibody of claim 40 and a suitable carrier. 

42. The antibody of claim 11, wherein the antibody is produced by screening a Fab expression
library.

43. The antibody of claim 11, wherein the antibody is produced by screening a recombinant
immunoglobulin library.

44. A method of detecting a polypeptide comprising an amino acid sequence selected from
the group consisting of SEQ ID NO:1-23 in a sample, the method comprising:

a) incubating the antibody of claim 11 with a sample under conditions to allow specific
binding of the antibody and the polypeptide, and

b) detecting specific binding, wherein specific binding indicates the presence of a
polypeptide comprising an amino acid sequence selected from the group consisting of
SEQ ID NO:1-23 in the sample.

45. A method of purifying a polypeptide comprising an amino acid sequence selected from
the group consisting of SEQ ID NO:1-23 from a sample, the method comprising:

a) incubating the antibody of claim 11 with a sample under conditions to allow specific
binding of the antibody and the polypeptide, and

b) separating the antibody from the sample and obtaining the purified polypeptide
comprising an amino acid sequence selected from the group consisting of SEQ ID
NO:1-23.

46. A microarray wherein at least one element of the microarray is a polynucleotide of claim
13.

47. A method of generating an expression profile of a sample which contains polynucleotides,
the method comprising:

a) labeling the polynucleotides of the sample,

b) contacting the elements of the microarray of claim 46 with the labeled polynucleotides
of the sample under conditions suitable for the formation of a hybridization complex,
and

c) quantifying the expression of the polynucleotides in the sample.

48. An array comprising different nucleotide molecules affixed in distinct physical locations
on a solid substrate, wherein at least one of said nucleotide molecules comprises a first oligonucleotide
or polynucleotide sequence specifically hybridizable with at least 30 contiguous nucleotides of a target
polynucleotide, and wherein said target polynucleotide is a polynucleotide of claim 12.

49. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is
completely complementary to at least 30 contiguous nucleotides of said target polynucleotide.

50. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is
completely complementary to at least 60 contiguous nucleotides of said target polynucleotide.

51. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is
completely complementary to said target polynucleotide.

52. An array of claim 48, which is a microarray.

53. An array of claim 48, further comprising said target polynucleotide hybridized to a
nucleotide molecule comprising said first oligonucleotide or polynucleotide sequence.

54. An array of claim 48, wherein a linker joins at least one of said nucleotide molecules to
said solid substrate.

55. An array of claim 48, wherein each distinct physical location on the substrate contains
multiple nucleotide molecules, and the multiple nucleotide molecules at any single distinct physical
location have the same sequence, and each distinct physical location on the substrate contains
nucleotide molecules having a sequence which differs from the sequence of nucleotide molecules at
another distinct physical location on the substrate.

56. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:1.

57. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:2.

58. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:3.

59. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:4.

60. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:5.

61. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:6.

62. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:7.

63. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:8.

64. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:9.

65. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:10.

66. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:11.

67. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:12.

68. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:13.

69. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:14.

70. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:15.

71. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:16.

72. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:17.

73. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:18.

74. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:19.

75. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:20.

76. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:21.

77. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:22. 

78. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:23. 

79. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:24. 

80. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:25. 

81. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:26. 

82. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:27. 

83. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:28. 

84. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:29. 

85. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:30. 

86. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:31.

87. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:32.

88. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:33.

89. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:34.

90. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:35.

91. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:36.

92. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:37.

93. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:38.

94. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:39.

95. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:40.

96. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:41.

97. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:42.

98. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:43.

99. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:44.

100. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID
NO:45.

101. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID
NO:46.

<110> INCYTE GENOMICS, INC.
LU, DYUNG AINA M.
ARVIZU, CHANDRA S.
GANDHI, AMEENA R.
HAPALIA, APRIL J. A.
DING, LI 
LU, YAN 

RAMKUMAR, JAYALAXMI 
SWARNAKER, ANITA 
TANG, Y. TOM
YUE, HENRY 
TRAN, BAO 
LEE, SOO Y. 
WARREN, BRIDGET A. 
NGUYEN, DANNIEL B. 
THANGAVELU, KAVITHA 
YAO, MONIQUE G. 
ELLIOTT, VICKI S. 
BAUGHN, MARIAH R. 
EMERLING, BROOKE M. 
LAL, PREETI 
GIETZEN, KIMBERLY J. 
BECHA, SHANYA D. 
MARQUIS, JOSEPH P. 
KABLE, AMY E. 



<120> MOLECULES FOR DISEASE DETECTION AND TREATMENT 

<130> PP-0921 PCT

<140> To Be Assigned 
<141> Herewith 

<150> 60/280,387; 60/282,335; 60/286,663; 60/285,484; 60/350,702; 60/351,749 

<151> 2001-03-30; 2001-04-05; 2001-04-13; 2001-04-19; 2002-01-18; 2002-01-25 

<160> 46 

<170> PERL Program 

<210> 1 
<211> 485 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 71230017CD1 

<400> 1 

Met Asp Pro Thr Ala Leu Val Glu Ala Ile Val Glu Glu Val Ala
1               5                   10                  15

Cys Pro Ile Cys Met Thr Phe Leu Arg Glu Pro Met Ser Ile Asp
                20                  25                  30

Cys Gly His Ser Phe Cys His Ser Cys Leu Ser Gly Leu Trp Glu
                35                  40                  45

Ile Pro Gly Glu Ser Gln Asn Trp Gly Tyr Thr Cys Pro Leu Cys
                50                  55                  60

Arg Ala Pro Val Gln Pro Arg Asn Leu Arg Pro Asn Trp Gln Leu
                65                  70                  75

Ala Asn Val Val Glu Lys Val Arg Leu Leu Arg Leu His Pro Gly
                80                  85                  90

Met Gly Leu Lys Gly Asp Leu Cys Glu Arg His Gly Glu Lys Leu
                95                  100                 105

Lys Met Phe Cys Lys Glu Asp Val Leu Ile Met Cys Glu Ala Cys
                110                 115                 120

Ser Gln Ser Pro Glu His Glu Ala His Ser Val Val Pro Met Glu
                125                 130                 135

Asp Val Ala Trp Glu Tyr Lys Trp Glu Leu His Glu Ala Leu Glu
                140                 145                 150

His Leu Lys Lys Glu Gln Glu Glu Ala Trp Lys Leu Glu Val Gly
                155                 160                 165

Glu Arg Lys Arg Thr Ala Thr Trp Lys Ile Gln Val Glu Thr Arg
                170                 175                 180

Lys Gln Ser Ile Val Trp Glu Phe Glu Lys Tyr Gln Arg Leu Leu
                185                 190                 195

Glu Lys Lys Gln Pro Pro His Arg Gln Leu Gly Ala Glu Val Ala
                200                 205                 210

Ala Ala Leu Ala Ser Leu Gln Arg Glu Ala Ala Glu Thr Met Gln
                215                 220                 225

Lys Leu Glu Leu Asn His Ser Glu Leu Ile Gln Gln Ser Gln Val
                230                 235                 240

Leu Trp Arg Met Ile Ala Glu Leu Lys Glu Arg Ser Gln Arg Pro
                245                 250                 255

Val Arg Trp Met Leu Gln Asp Ile Gln Glu Val Leu Asn Arg Ser
                260                 265                 270

Lys Ser Trp Ser Leu Gln Gln Pro Glu Pro Ile Ser Leu Glu Leu
                275                 280                 285

Lys Thr Asp Cys Arg Val Leu Gly Leu Arg Glu Ile Leu Lys Thr
                290                 295                 300

Tyr Ala Ala Asp Val Arg Leu Asp Pro Asp Thr Ala Tyr Ser Arg
                305                 310                 315

Leu Ile Val Ser Glu Asp Arg Lys Arg Val His Tyr Gly Asp Thr
                320                 325                 330

Asn Gln Lys Leu Pro Asp Asn Pro Glu Arg Phe Tyr Arg Tyr Asn
                335                 340                 345

Ile Val Leu Gly Ser Gln Cys Ile Ser Ser Gly Arg His Tyr Trp
                350                 355                 360

Glu Val Glu Val Gly Asp Arg Ser Glu Trp Gly Leu Gly Val Cys
                365                 370                 375

Lys Gln Asn Val Asp Arg Lys Glu Val Val Tyr Leu Ser Pro His
                380                 385                 390

Tyr Gly Phe Trp Val Ile Arg Leu Arg Lys Gly Asn Glu Tyr Arg
                395                 400                 405

Ala Gly Thr Asp Glu Tyr Pro Ile Leu Ser Leu Pro Val Pro Pro
                410                 415                 420

Arg Arg Val Gly Ile Phe Val Asp Tyr Glu Ala His Asp Ile Ser
                425                 430                 435

Phe Tyr Asn Val Thr Asp Cys Gly Ser His Ile Phe Thr Phe Pro
                440                 445                 450

Arg Tyr Pro Phe Pro Gly Arg Leu Leu Pro Tyr Phe Ser Pro Cys
                455                 460                 465

Tyr Ser Ile Gly Thr Asn Asn Thr Ala Pro Leu Ala Ile Cys Ser
                470                 475                 480

Leu Asp Gly Glu Asp
                485

<210> 2 
<211> 1404 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 3125036CD1 

<400> 2 



Met Glu Ser Ser Ser Ser Asp Tyr Tyr Asn Lys Asp Asn Glu Glu

1 5 10 15

Glu Ser Leu Leu Ala Asn Val Ala Ser Leu Arg His Glu Leu Lys

20 25 30

Ile Thr Glu Trp Ser Leu Gln Ser Leu Gly Glu Glu Leu Ser Ser

35 40 45

Val Ser Pro Ser Glu Asn Ser Asp Tyr Ala Pro Asn Pro Ser Arg

50 55 60

Ser Glu Lys Leu Ile Leu Asp Val Gln Pro Ser His Pro Gly Leu

65 70 75

Leu Asn Tyr Ser Pro Tyr Glu Asn Val Cys Lys Ile Ser Gly Ser

80 85 90

Ser Thr Asp Phe Gln Lys Lys Pro Arg Asp Lys Met Phe Ser Ser

95 100 105

Ser Ala Pro Val Asp Gln Glu Ile Lys Ser Leu Arg Glu Lys Leu

110 115 120

Asn Lys Leu Arg Gln Gln Asn Ala Cys Leu Val Thr Gln Asn His

125 130 135

Ser Leu Met Thr Lys Phe Glu Ser Ile His Phe Glu Leu Thr Gln

140 145 150

Ser Arg Ala Lys Val Ser Met Leu Glu Ser Ala Gln Gln Gln Ala

155 160 165

Ala Ser Val Pro Ile Leu Glu Glu Gln Ile Ile Asn Leu Glu Ala

170 175 180

Glu Val Ser Ala Gln Asp Lys Val Leu Arg Glu Ala Glu Asn Lys

185 190 195

Leu Glu Gln Ser Gln Lys Met Val Ile Glu Lys Glu Gln Ser Leu

200 205 210

Gln Glu Ser Lys Glu Glu Cys Ile Lys Leu Lys Val Asp Leu Leu

215 220 225

Glu Gln Thr Lys Gln Gly Lys Arg Ala Glu Arg Gln Arg Asn Glu

230 235 240

Ala Leu Tyr Asn Ala Glu Glu Leu Ser Lys Ala Phe Gln Gln Tyr

245 250 255

Lys Lys Lys Val Ala Glu Lys Leu Glu Lys Val Lys Gly Ser Cys

260 265 270

Ala Asn Ser Val Phe Cys Ile Thr Val Tyr Ile Pro Thr Val Lys

275 280 285

Val Gln Ala Glu Glu Glu Ile Leu Glu Arg Asn Leu Thr Asn Cys

290 295 300

Glu Lys Glu Asn Lys Arg Leu Gln Glu Arg Cys Gly Leu Tyr Lys

305 310 315

Ser Glu Leu Glu Ile Leu Lys Glu Lys Leu Arg Gln Leu Lys Glu

320 325 330

Glu Asn Asn Asn Gly Lys Glu Lys Leu Arg Ile Met Ala Val Lys

335 340 345

Asn Ser Glu Val Met Ala Gln Leu Thr Glu Ser Arg Gln Ser Ile

350 355 360

Leu Lys Leu Glu Ser Glu Leu Glu Asn Lys Asp Glu Ile Leu Arg

365 370 375

Asp Lys Phe Ser Leu Met Asn Glu Asn Arg Glu Leu Lys Val Arg

380 385 390

Val Ala Ala Gln Asn Glu Arg Leu Asp Leu Cys Gln Gln Glu Ile

395 400 405

Glu Ser Ser Arg Val Glu Leu Arg Ser Leu Glu Lys Ile Ile Ser

410 415 420

Gln Leu Pro Leu Lys Arg Glu Leu Phe Gly Phe Lys Ser Tyr Leu

425 430 435

Ser Lys Tyr Gln Met Ser Ser Phe Ser Asn Lys Glu Asp Arg Cys

440 445 450

Ile Gly Cys Cys Glu Ala Asn Lys Leu Val Ile Ser Glu Leu Arg

455 460 465

Ile Lys Leu Ala Ile Lys Glu Ala Glu Ile Gln Lys Leu His Ala

470 475 480

Asn Leu Thr Ala Asn Gln Leu Ser Gln Ser Leu Ile Thr Cys Asn

485 490 495

Asp Ser Gln Glu Ser Ser Lys Leu Ser Ser Leu Glu Thr Glu Pro

500 505 510

Val Lys Leu Gly Gly His Gln Val Ala Glu Ser Val Lys Asp Gln

515 520 525

Asn Gln His Thr Met Asn Lys Gln Tyr Glu Lys Glu Arg Gln Arg

530 535 540

Leu Val Thr Gly Ile Glu Glu Leu Arg Thr Lys Leu Ile Gln Ile

545 550 555

Glu Ala Glu Asn Ser Asp Leu Lys Val Asn Met Ala His Arg Thr

560 565 570

Ser Gln Phe Gln Leu Ile Gln Glu Glu Leu Leu Glu Lys Ala Ser

575 580 585

Asn Ser Ser Lys Leu Glu Ser Glu Met Thr Lys Lys Cys Ser Gln

590 595 600

Leu Leu Thr Leu Glu Lys Gln Leu Glu Glu Lys Ile Val Ala Tyr

605 610 615

Ser Ser Ile Ala Ala Lys Asn Ala Glu Leu Glu Gln Glu Leu Met

620 625 630

Glu Lys Asn Glu Lys Ile Arg Ser Leu Glu Thr Asn Ile Asn Thr

635 640 645

Glu His Glu Lys Ile Cys Leu Ala Phe Glu Lys Ala Lys Lys Ile

650 655 660

His Leu Glu Gln His Lys Glu Met Glu Lys Gln Ile Glu Arg Val

665 670 675

Arg Gln Leu Asp Ser Ala Leu Glu Ile Cys Lys Glu Glu Leu Val

680 685 690

Leu His Leu Asn Gln Leu Glu Gly Asn Lys Glu Lys Phe Glu Lys

695 700 705

Gln Leu Lys Lys Lys Ser Glu Glu Val Tyr Cys Leu Gln Lys Glu

710 715 720

Leu Lys Ile Lys Asn His Ser Leu Gln Glu Thr Ser Glu Gln Asn
725 730 735

Val Ile Leu Gln His Thr Leu Gln Gln Gln Gln Gln Met Leu Gln

740 745 750

Gln Glu Thr Ile Arg Asn Gly Glu Leu Glu Asp Thr Gln Thr Lys

755 760 765

Leu Glu Lys Gln Val Ser Lys Leu Glu Gln Glu Leu Gln Lys Gln

770 775 780

Arg Glu Ser Ser Ala Glu Lys Leu Arg Lys Met Glu Glu Lys Cys

785 790 795

Glu Ser Ala Ala His Glu Ala Asp Leu Lys Arg Gln Lys Val Ile

800 805 810

Glu Leu Thr Gly Thr Ala Arg Gln Val Lys Ile Glu Met Asp Gln

815 820 825

Tyr Lys Glu Glu Leu Ser Lys Met Glu Lys Glu Ile Met His Leu

830 835 840

Lys Arg Asp Gly Glu Asn Lys Ala Met His Leu Ser Gln Leu Asp

845 850 855

Met Ile Leu Asp Gln Thr Lys Thr Glu Leu Glu Lys Lys Thr Asn

860 865 870

Ala Val Lys Glu Leu Glu Lys Leu Gln His Ser Thr Glu Thr Glu

875 880 885

Leu Thr Glu Ala Leu Gln Lys Arg Glu Val Leu Glu Thr Glu Leu

890 895 900

Gln Asn Ala His Gly Glu Leu Lys Ser Thr Leu Arg Gln Leu Gln

905 910 915

Glu Leu Arg Asp Val Leu Gln Lys Ala Gln Leu Ser Leu Glu Glu

920 925 930

Lys Tyr Thr Thr Ile Lys Asp Leu Thr Ala Glu Leu Arg Glu Cys

935 940 945

Lys Met Glu Ile Glu Asp Lys Lys Gln Glu Leu Leu Glu Met Asp

950 955 960

Gln Ala Leu Lys Glu Arg Asn Trp Glu Leu Lys Gln Arg Ala Ala

965 970 975

Gln Val Thr His Leu Asp Met Thr Ile Arg Glu His Arg Gly Glu

980 985 990

Met Glu Gln Lys Ile Ile Lys Leu Glu Gly Thr Leu Glu Lys Ser

995 1000 1005

Glu Leu Glu Leu Lys Glu Cys Asn Lys Gln Ile Glu Ser Leu Asn
1010 1015 1020

Asp Lys Leu Gln Asn Ala Lys Glu Gln Val Arg Glu Lys Glu Phe
1025 1030 1035

Ile Met Leu Gln Asn Glu Gln Glu Ile Ser Gln Leu Lys Lys Glu
1040 1045 1050

Ile Glu Arg Thr Gln Gln Arg Met Lys Glu Met Glu Ser Val Met
1055 1060 1065

Lys Glu Gln Glu Gln Tyr Ile Ala Thr Gln Tyr Lys Glu Ala Ile
1070 1075 1080

Asp Leu Gly Gln Glu Leu Arg Leu Thr Arg Glu Gln Val Gln Asn
1085 1090 1095

Ser His Thr Glu Leu Ala Glu Ala Arg His Gln Gln Val Gln Ala
1100 1105 1110

Gln Arg Glu Ile Glu Arg Leu Ser Ser Glu Leu Glu Asp Met Lys
1115 1120 1125

Gln Leu Ser Lys Glu Lys Asp Ala His Gly Asn His Leu Ala Glu
1130 1135 1140

Glu Leu Gly Ala Ser Lys Val Arg Glu Ala His Leu Glu Ala Arg
1145 1150 1155


rieu AXa. 


(jrjLu xxe 


Lys 


Lys 


Leu Ser 7U.a 


Glu Val Glu 


Ser Leu 




llbU 






life 

1165 




1170 


JjyS \a±\X AXa 


Tyr His 


Met 


Glu 


Met lie Ser 


His Gin Glu 


Asn His 




1175 






1180 




1185 


Ala Lys Trp Lys Ile Ser Ala Asp Ser Gln Lys Ser Ser Val Gln
1190 1195 1200

Gln Leu Asn Glu Gln Leu Glu Lys Ala Lys Leu Glu Leu Glu Glu
1205 1210 1215

Ala Gln Asp Thr Val Ser Asn Leu His Gln Gln Val Gln Asp Arg
1220 1225 1230

Asn Glu Val Ile Glu Ala Ala Asn Glu Ala Leu Leu Thr Lys Glu
1235 1240 1245


d6x GXU. ij6U 


Thr Argf 


Leu 


Gin 


Ala Lys lie 


Ser Gly His 


Glu Lys 




1250 






1255 




1260 


Ala Glu Asp Ile Lys Phe Leu Pro Ala Pro Phe Thr Ser Pro Thr
1265 1270 1275

Glu Ile Met Pro Asp Val Gln Asp Pro Lys Phe Ala Lys Cys Phe
1280 1285 1290

His Thr Ser Phe Ser Lys Cys Thr Lys Leu Arg Arg Ser Ile Ser
1295 1300 1305

Ala Ser Asp Leu Thr Phe Lys Ile His Gly Asp Glu Asp Leu Ser
1310 1315 1320

Glu Glu Leu Leu Gln Asp Leu Lys Lys Met Gln Leu Glu Gln Pro
1325 1330 1335

Ser Thr Leu Glu Glu Ser His Lys Asn Leu Thr Tyr Thr Gln Pro
1340 1345 1350

Asp Ser Phe Lys Pro Leu Thr Tyr Asn Leu Glu Ala Asp Ser Ser
1355 1360 1365

Glu Asn Asn Asp Phe Asn Thr Leu Ser Gly Met Leu Arg Tyr Ile
1370 1375 1380

Asn Lys Glu Val Arg Leu Leu Lys Lys Ser Ser Met Gln Thr Gly
1385 1390 1395

Ala Gly Leu Asn Gln Gly Glu Asn Val
1400

<210> 3

<211> 1096

<212> PRT

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 1758089CD1

<400> 3














Met Gly Ser Glu Asp His Gly Ala Gln Asn Pro Ser Cys Lys Ile

1 5 10 15

Met Thr Phe Arg Pro Thr Met Glu Glu Phe Lys Asp Phe Asn Lys

20 25 30

Tyr Val Ala Tyr Ile Glu Ser Gln Gly Ala His Arg Ala Gly Leu

35 40 45

Ala Lys Ile Ile Pro Pro Lys Glu Trp Lys Pro Arg Gln Thr Tyr

50 55 60

Asp Asp Ile Asp Asp Val Val Ile Pro Ala Pro Ile Gln Gln Val

65 70 75
VCIX JLlXi, 


Gly Gin 


Ser 




Leu 




xnr 


Gin 


Tyr 


Asn 


He 


Gin 


Lys 








OU 




















90 


jjyB A±cl 


Met 


Thr 


ITa 1 

vslx 


G±y 


UlU 


Tyr 


Arg 


Arg 


Leu 


Ala 


Asn 


Ser 


Glu 


















100 










105 




Cys 


Thr 


trTO 


Arg 


liXS 


Gxn 


Asp 


Pne 


Asp 


Asp 


Leu 


Glu 


Arg 


















115 










120 




Trp Lys 


Asn 


T All 

Leu 


Thr 


Fne 


Val 


Ser 


Pro 


He 


Tyr 


Gly 


Ala 


















130 










135 




Ser Gly 


ser 


Leu 


Tyr 


Asp 


Asp 


Asp 


Val 


Ala 


Gin 


Trp 


Asn 


















145 










150 




Ser 


Leu 


Arg 


Thr 


xie 


Leu 


Asp 


Met 


Val 


Glu 


Arg 


Glu 


Cys 








ICC 

155 










160 










165 


v»j.y xiuT 


He 


He 


Glu 


Gly 


Val 


Asn 


Thr 


Pro 


Tyr 


Leu 


Tyr 


Phe 


Gly 








170 










175 










180 


1urA4- m u 1 LI L 


Lys 


Thr 


Thr 


Pne 


Ala 


Trp 


His 


Thr 


Glu 


Asp 


Met 


Asp 


Leu 








185 










190 










195 




He 


Asn 


Tyr 


Leu 


His 


Phe 


Gly 


Glu 


Pro 


Lys 


Ser 


Trp 


Tyr 








200 










205 










210 


iiJLa. xj.e 


Pro 


Pro 


Glu 


His 


Gly 


Lys 


Arg 


Leu 


Glu 


Arg 


Leu 


Ala 


He 








215 










220 










225 




Phe 


Pro 


Gly 


Ser 


Ser 


Gin 


Gly 


Cys 


Asp 


Ala 


Phe 


Leu 


Arg 








O O rt 










235 










240 




Met 


Thr 


Leu 


xj.e 


Ser 


Pro 


He 


He 


Leu 


Lys 


Lys 


Tyr 


Gly 








245 










250 










255 


±16 Fro 


Phe 


Ser 


Arg 


He 


Thr 


Gin 


Glu 


Ala 


Gly 


Glu 


Phe 


Met 


He 








260 










265 










270 


Thr Phe Pro Tyr Gly Tyr His Ala Gly Phe Asn His Gly Phe Asn

275 280 285

Cys Ala Glu Ser Thr Asn Phe Ala Thr Leu Arg Trp Ile Asp Tyr

290 295 300

Gly Lys Val Ala Thr Gln Cys Thr Cys Arg Lys Asp Met Val Lys

305 310 315

Ile Ser Met Asp Val Phe Val Arg Ile Leu Gln Pro Glu Arg Tyr

320 325 330

Glu Leu Trp Lys Gln Gly Lys Asp Leu Thr Val Leu Asp His Thr

335 340 345

Arg Pro Thr Ala Leu Thr Ser Pro Glu Leu Ser Ser Trp Ser Ala

350 355 360

Ser Arg Ala Ser Leu Lys Ala Lys Leu Leu Arg Arg Ser His Arg

365 370 375

Lys Arg Ser Gln Pro Lys Lys Pro Lys Pro Glu Asp Pro Lys Phe

380 385 390

Pro Gly Glu Gly Thr Ala Gly Ala Ala Leu Leu Glu Glu Ala Gly

395 400 405

Gly Ser Val Lys Glu Glu Ala Gly Pro Glu Val Asp Pro Glu Glu

410 415 420

Glu Glu Glu Glu Pro Gln Pro Leu Pro His Gly Arg Glu Ala Glu

425 430 435

Gly Ala Glu Glu Asp Gly Arg Gly Lys Leu Arg Pro Thr Lys Ala

440 445 450

Lys Ser Glu Arg Lys Lys Lys Ser Phe Gly Leu Leu Pro Pro Gln

455 460 465

Leu Pro Pro Pro Pro Ala His Phe Pro Ser Glu Glu Ala Leu Trp

470 475 480

Leu Pro Ser Pro Leu Glu Pro Pro Val Leu Gly Pro Gly Pro Ala

485 490 495

Ala Met Glu Glu Ser Pro Leu Pro Ala Pro Leu Asn Val Val Pro

500 505 510

Pro Glu Val Pro Ser Glu Glu Leu Glu Ala Lys Pro Arg Pro Ile

515 520 525

Ile Pro Met Leu Tyr Val Val Pro Arg Pro Gly Lys Ala Ala Phe

530 535 540

Asn Gln Glu His Val Ser Cys Gln Gln Ala Phe Glu His Phe Ala

545 550 555

Gln Lys Gly Pro Thr Trp Lys Glu Pro Val Ser Pro Met Glu Leu

560 565 570

Thr Gly Pro Glu Asp Gly Ala Ala Ser Ser Gly Ala Gly Arg Met

575 580 585

Glu Thr Lys Ala Arg Ala Gly Glu Gly Gln Ala Pro Ser Thr Phe

590 595 600

Ser Lys Leu Lys Met Glu Ile Lys Lys Ser Arg Arg His Pro Leu

605 610 615

Gly Arg Pro Pro Thr Arg Ser Pro Leu Ser Val Val Lys Gln Glu

620 625 630

Ala Ser Ser Asp Glu Glu Ala Ser Pro Phe Ser Gly Glu Glu Asp

635 640 645

Val Ser Asp Pro Asp Ala Leu Arg Pro Leu Leu Ser Leu Gln Trp

650 655 660

Lys Asn Arg Ala Ala Ser Phe Gln Ala Glu Arg Lys Phe Asn Ala

665 670 675

Ala Ala Ala Arg Thr Glu Pro Tyr Cys Ala Ile Cys Thr Leu Phe

680 685 690

Tyr Pro Tyr Cys Gln Ala Leu Gln Thr Glu Lys Glu Ala Pro Ile

695 700 705

Ala Ser Leu Gly Glu Gly Cys Pro Ala Thr Leu Pro Ser Lys Ser

710 715 720

Arg Gln Lys Thr Arg Pro Leu Ile Pro Glu Met Cys Phe Thr Ser

725 730 735

Gly Gly Glu Asn Thr Glu Pro Leu Pro Ala Asn Ser Tyr Ile Gly

740 745 750

Asp Asp Gly Thr Ser Pro Leu Ile Ala Cys Gly Lys Cys Cys Leu

755 760 765

Gln Val His Ala Ser Cys Tyr Gly Ile Arg Pro Glu Leu Val Asn

770 775 780

Glu Gly Trp Thr Cys Ser Arg Cys Ala Ala His Ala Trp Thr Ala

785 790 795

Glu Cys Cys Leu Cys Asn Leu Arg Gly Gly Ala Leu Gln Met Thr

800 805 810

Thr Asp Arg Arg Trp Ile His Val Ile Cys Ala Ile Ala Val Pro

815 820 825

Glu Ala Arg Phe Leu Asn Val Ile Glu Arg His Pro Val Asp Ile

830 835 840

Ser Ala Ile Pro Glu Gln Arg Trp Lys Leu Lys Cys Val Tyr Cys

845 850 855

Arg Lys Arg Met Lys Lys Val Ser Gly Ala Cys Ile Gln Cys Ser

860 865 870

Tyr Glu His Cys Ser Thr Ser Phe His Val Thr Cys Ala His Ala

875 880 885

Ala Gly Val Leu Met Glu Pro Asp Asp Trp Pro Tyr Val Val Ser

890 895 900

Ile Thr Cys Leu Lys His Lys Ser Gly Gly His Ala Val Gln Leu

905 910 915

Leu Arg Ala Val Ser Leu Gly Gln Val Val Ile Thr Lys Asn Arg
920 925 930

Asn Gly Leu Tyr Tyr Arg Cys Arg Val Ile Gly Ala Ala Ser Gln
935 940 945

Thr Cys Tyr Glu Val Asn Phe Asp Asp Gly Ser Tyr Ser Asp Asn
950 955 960

Leu Tyr Pro Glu Ser Ile Thr Ser Arg Asp Cys Val Gln Leu Gly
965 970 975

Pro Pro Ser Glu Gly Glu Leu Val Glu Leu Arg Trp Thr Asp Gly
980 985 990

Asn Leu Tyr Lys Ala Lys Phe Ile Ser Ser Val Thr Ser His Ile
995 1000 1005

Tyr Gln Val Glu Phe Glu Asp Gly Ser Gln Leu Thr Val Lys Arg
1010 1015 1020

Gly Asp Ile Phe Thr Leu Glu Glu Glu Leu Pro Lys Arg Val Arg
1025 1030 1035

Ser Arg Leu Ser Leu Ser Thr Gly Ala Pro Gln Glu Pro Ala Phe
1040 1045 1050

Ser Gly Glu Glu Ala Lys Ala Ala Lys Arg Pro Arg Val Gly Thr
1055 1060 1065

Pro Leu Ala Thr Glu Asp Ser Gly Arg Ser Gln Asp Tyr Val Ala
1070 1075 1080

Phe Val Glu Ser Leu Leu Gln Val Gln Gly Arg Pro Gly Ala Pro
1085 1090 1095

Phe



<210> 4 

<211> 167 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 3533891CD1 

<400> 4 

Met Tyr Met Gly Met Met Cys Thr Ala Lys Lys Cys Gly Ile Arg
1 5 10 15

Phe Gln Pro Pro Ala Ile Ile Leu Ile Tyr Glu Ser Glu Ile Lys

20 25 30

Gly Lys Ile Arg Gln Arg Ile Met Pro Val Arg Asn Phe Ser Lys

35 40 45

Phe Ser Asp Cys Thr Arg Ala Ala Glu Gln Leu Lys Asn Asn Pro

50 55 60

Arg His Lys Ser Tyr Leu Glu Gln Val Ser Leu Arg Gln Leu Glu

65 70 75

Lys Leu Phe Ser Phe Leu Arg Gly Tyr Leu Ser Gly Gln Ser Leu

80 85 90

Ala Glu Thr Met Glu Gln Ile Gln Arg Glu Thr Thr Ile Asp Pro

95 100 105

Glu Glu Asp Leu Asn Lys Leu Asp Asp Lys Glu Leu Ala Lys Arg
110 115 120

Lys Ser Ile Met Asp Glu Leu Phe Glu Lys Asn Gln Lys Lys Lys
125 130 135

Asp Asp Pro Asn Phe Val Tyr Asp Ile Glu Val Glu Phe Pro Gln
140 145 150

Asp Asp Gln Leu Gln Ser Cys Gly Trp Asp Thr Glu Ser Ala Asp
155 160 165 

Glu Phe 



<210> 5 

<211> 1523 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 1510943CD1 

<400> 5 



Met Thr Ser Val Trp Lys Arg Leu Gln Arg Val Gly Lys Arg Ala

1 5 10 15

Ala Lys Phe Gln Phe Val Ala Cys Tyr His Glu Leu Val Leu Glu

20 25 30

Cys Thr Lys Lys Trp Gln Pro Asp Lys Leu Val Val Val Trp Thr

35 40 45

Arg Arg Asn Arg Arg Ile Cys Ser Lys Ala His Ser Trp Gln Pro

50 55 60

Gly Ile Gln Asn Pro Tyr Arg Gly Thr Val Val Trp Met Val Pro

65 70 75

Glu Asn Val Asp Ile Ser Val Thr Leu Tyr Arg Asp Pro His Val

80 85 90

Asp Gln Tyr Glu Ala Lys Glu Trp Thr Phe Ile Ile Glu Asn Glu

95 100 105

Ser Lys Gly Gln Arg Lys Val Leu Ala Thr Ala Glu Val Asp Leu

110 115 120

Ala Arg His Ala Gly Pro Val Pro Val Gln Val Pro Leu Arg Leu

125 130 135

Arg Leu Lys Pro Lys Ser Val Lys Val Val Gln Ala Glu Leu Ser

140 145 150

Leu Thr Leu Ser Gly Val Leu Leu Arg Glu Gly Arg Ala Thr Asp

155 160 165

Asp Asp Met Gln Ser Leu Ala Ser Leu Met Ser Val Lys Pro Ser

170 175 180

Asp Val Gly Asn Leu Asp Asp Phe Ala Glu Ser Asp Glu Asp Glu

185 190 195

Ala His Gly Pro Gly Ala Pro Glu Ala Arg Ala Arg Val Pro Gln

200 205 210

Pro Asp Pro Ser Arg Glu Leu Lys Thr Leu Cys Glu Glu Glu Glu

215 220 225

Glu Gly Gln Gly Arg Pro Gln Gln Ala Val Ala Ser Pro Ser Asn

230 235 240

Ala Glu Asp Thr Ser Pro Ala Pro Val Ser Ala Pro Ala Pro Pro

245 250 255

Ala Arg Thr Ser Arg Gly Gln Gly Ser Glu Arg Ala Asn Glu Ala

260 265 270

Gly Gly Gln Val Gly Pro Glu Ala Pro Arg Pro Pro Glu Thr Ser

275 280 285

Pro Glu Met Arg Ser Ser Arg Gln Pro Ala Gln Asp Thr Ala Pro

290 295 300

Thr Pro Ala Pro Arg Leu Arg Lys Gly Ser Asp Ala Leu Arg Pro

305 310 315

Pro Val Pro Gln Gly Glu Asp Glu Val Pro Lys Ala Ser Gly Ala

320 325 330

Pro Pro Ala Gly Leu Gly Ser Ala Arg Glu Thr Gln Ala Gln Ala

335 340 345

Cys Pro Gln Glu Gly Thr Glu Ala His Gly Ala Arg Leu Gly Pro

350 355 360

Ser Ile Glu Asp Lys Gly Ser Gly Asp Pro Phe Gly Arg Gln Arg

365 370 375

Leu Lys Ala Glu Glu Met Asp Thr Glu Asp Arg Pro Glu Ala Ser

380 385 390

Gly Val Asp Thr Glu Pro Arg Ser Gly Gly Arg Glu Ala Asn Thr

395 400 405

Lys Arg Ser Gly Val Arg Ala Gly Glu Ala Glu Glu Ser Ser Ala

410 415 420

Val Cys Gln Val Asp Ala Glu Gln Arg Ser Lys Val Arg His Val

425 430 435

Asp Thr Lys Gly Pro Glu Ala Thr Gly Val Met Pro Glu Ala Arg

440 445 450

Cys Arg Gly Thr Pro Glu Ala Pro Pro Arg Gly Ser Gln Gly Arg

455 460 465

Leu Gly Val Arg Thr Arg Asp Glu Ala Pro Ser Gly Leu Ser Leu

470 475 480

Pro Pro Ala Glu Pro Ala Gly His Ser Gly Gln Leu Gly Asp Leu

485 490 495

Glu Gly Ala Arg Ala Ala Ala Gly Gln Glu Arg Glu Gly Ala Glu

500 505 510

Val Arg Gly Gly Ala Pro Gly Ile Glu Gly Thr Gly Leu Glu Gln

515 520 525

Gly Pro Ser Val Gly Ala Ile Ser Thr Arg Pro Gln Val Ser Ser

530 535 540

Trp Gln Gly Ala Leu Leu Ser Thr Ala Gln Gly Ala Ile Ser Arg

545 550 555

Gly Leu Gly Gly Trp Glu Ala Glu Ala Gly Gly Ser Gly Val Leu

560 565 570

Glu Thr Glu Thr Glu Val Val Gly Leu Glu Val Leu Gly Thr Gln

575 580 585

Glu Lys Glu Val Glu Gly Ser Gly Phe Pro Glu Thr Arg Thr Leu

590 595 600

Glu Ile Glu Ile Leu Gly Ala Leu Glu Lys Glu Ala Ala Arg Ser

605 610 615

Arg Val Leu Glu Ser Glu Val Ala Gly Thr Ala Gln Cys Glu Gly

620 625 630

Leu Glu Thr Gln Glu Thr Glu Val Gly Val Ile Glu Thr Pro Gly

635 640 645

Thr Glu Thr Glu Val Leu Gly Thr Gln Lys Thr Glu Ala Gly Gly

650 655 660

Ser Gly Val Leu Gln Thr Arg Thr Thr Ile Ala Glu Thr Glu Val

665 670 675

Leu Val Thr Gln Glu Ile Ser Gly Asp Leu Gly Pro Leu Lys Ile

680 685 690

Glu Asp Thr Ile Gln Ser Glu Met Leu Gly Thr Gln Glu Thr Glu

695 700 705

Val Glu Ala Ser Arg Val Pro Glu Ser Glu Ala Glu Gly Thr Glu

710 715 720

Ala Lys Ile Leu Gly Thr Gln Glu Ile Thr Ala Arg Asp Ser Gly

725 730 735

Val Arg Glu Ile Glu Ala Glu Ile Ala Glu Ser Asp Ile Leu Val

740 745 750

Ala Gln Glu Ile Glu Val Gly Leu Leu Gly Val Leu Gly Ile Glu

755 760 765

Thr Gly Ala Ala Glu Gly Ala Ile Leu Gly Thr Gln Glu Ile Ala

770 775 780

Ser Arg Asp Ser Gly Val Pro Gly Leu Glu Ala Asp Thr Thr Gly

785 790 795

Ile Gln Val Lys Glu Val Gly Gly Ser Glu Val Pro Glu Ile Ala

800 805 810

Thr Gly Thr Ala Glu Thr Glu Ile Leu Gly Thr Gln Glu Ile Ala

815 820 825

Ser Arg Ser Ser Gly Val Pro Gly Leu Glu Ser Glu Val Ala Gly

830 835 840

Ala Gln Glu Thr Glu Val Gly Gly Ser Gly Ile Ser Gly Pro Glu

845 850 855

Ala Gly Met Ala Glu Ala Arg Val Leu Met Thr Arg Lys Thr Glu

860 865 870

Ile Ile Val Pro Glu Ala Glu Lys Glu Glu Ala Gln Thr Ser Gly

875 880 885

Val Gln Glu Ala Glu Thr Arg Val Gly Ser Ala Leu Lys Tyr Glu

890 895 900

Ala Leu Arg Ala Pro Val Thr Gln Pro Arg Val Leu Gly Ser Gln

905 910 915

Glu Ala Lys Ala Glu Ile Ser Gly Val Gln Gly Ser Glu Thr Gln

920 925 930

Val Leu Arg Val Gln Glu Ala Glu Ala Gly Val Trp Gly Met Ser

935 940 945

Glu Gly Lys Ser Gly Ala Trp Gly Ala Gln Glu Ala Glu Met Lys

950 955 960

Val Leu Glu Ser Pro Glu Asn Lys Ser Gly Thr Phe Lys Ala Gln

965 970 975

Glu Ala Glu Ala Gly Val Leu Gly Asn Glu Lys Gly Lys Glu Ala

980 985 990

Glu Gly Ser Leu Thr Glu Ala Ser Leu Pro Glu Ala Gln Val Ala

995 1000 1005

Ser Gly Ala Gly Ala Gly Ala Pro Arg Ala Ser Ser Pro Glu Lys
1010 1015 1020

Ala Glu Glu Asp Arg Arg Leu Pro Gly Ser Gln Ala Pro Pro Ala
1025 1030 1035

Leu Val Ser Ser Ser Gln Ser Leu Leu Glu Trp Cys Gln Glu Val
1040 1045 1050

Thr Thr Gly Tyr Arg Gly Val Arg Ile Thr Asn Phe Thr Thr Ser
1055 1060 1065

Trp Arg Asn Gly Leu Ala Phe Cys Ala Ile Leu His Arg Phe Tyr
1070 1075 1080

Pro Asp Lys Ile Asp Tyr Ala Ser Leu Asp Pro Leu Asn Ile Lys
1085 1090 1095

Gln Asn Asn Lys Gln Ala Phe Asp Gly Phe Ala Ala Leu Gly Val
1100 1105 1110

Ser Arg Leu Leu Glu Pro Ala Asp Met Val Leu Leu Ser Val Pro
1115 1120 1125

Asp Lys Leu Ile Val Met Thr Tyr Leu Cys Gln Ile Arg Ala Phe
1130 1135 1140

Cys Thr Gly Gln Glu Leu Gln Leu Val Gln Leu Glu Gly Gly Gly
1145 1150 1155

Gly Ala Gly Thr Tyr Arg Val Gly Ser Ala Gln Pro Ser Pro Pro
1160 1165 1170

Asp Asp Leu Asp Ala Gly Gly Leu Ala Gln Arg Leu Arg Gly His
1175 1180 1185

Gly Ala Glu Gly Pro Gln Glu Pro Lys Glu Ala Ala Asp Arg Ala
1190 1195 1200

Asp Gly Ala Ala Pro Gly Val Ala Ser Arg Asn Ala Val Ala Gly
1205 1210 1215

Arg Ala Ser Lys Asp Gly Gly Ala Glu Ala Pro Arg Glu Ser Arg
1220 1225 1230

Pro Ala Glu Val Pro Ala Glu Gly Leu Val Asn Gly Ala Gly Ala
1235 1240 1245

Pro Gly Gly Gly Gly Val Arg Leu Arg Arg Pro Ser Val Asn Gly
1250 1255 1260

Glu Pro Gly Ser Val Pro Pro Pro Arg Ala His Gly Ser Phe Ser
1265 1270 1275

His Val Arg Asp Ala Asp Leu Leu Lys Lys Arg Arg Ser Arg Leu
1280 1285 1290

Arg Asn Ser Ser Ser Phe Ser Met Asp Asp Pro Asp Ala Gly Ala
1295 1300 1305

Met Gly Ala Ala Ala Ala Glu Gly Gln Ala Pro Asp Pro Ser Pro
1310 1315 1320

Ala Pro Gly Pro Pro Thr Ala Ala Asp Ser Gln Gln Pro Pro Gly
1325 1330 1335

Gly Ser Ser Pro Ser Glu Glu Pro Pro Pro Ser Pro Gly Glu Glu
1340 1345 1350

Ala Gly Leu Gln Arg Phe Gln Asp Thr Ser Gln Tyr Val Cys Ala
1355 1360 1365

Glu Leu Gln Ala Leu Glu Gln Glu Gln Arg Gln Ile Asp Gly Arg
1370 1375 1380

Ala Ala Glu Val Glu Met Gln Leu Arg Ser Leu Met Glu Ser Gly
1385 1390 1395

Ala Asn Lys Leu Gln Glu Glu Val Leu Ile Gln Glu Trp Phe Thr
1400 1405 1410

Leu Val Asn Lys Lys Asn Ala Leu Ile Arg Arg Gln Asp Gln Leu
1415 1420 1425

Gln Leu Leu Met Glu Glu Gln Asp Leu Glu Arg Arg Phe Glu Leu
1430 1435 1440

Leu Ser Arg Glu Leu Arg Ala Met Leu Ala Ile Glu Asp Trp Gln
1445 1450 1455

Lys Thr Ser Ala Gln Gln His Arg Glu Gln Leu Leu Leu Glu Glu
1460 1465 1470

Leu Val Ser Leu Val Asn Gln Arg Asp Glu Leu Val Arg Asp Leu
1475 1480 1485

Asp His Lys Glu Arg Ile Ala Leu Glu Glu Asp Glu Arg Leu Glu
1490 1495 1500

Arg Gly Leu Glu Gln Arg Arg Arg Lys Leu Ser Arg Gln Leu Ser
1505 1510 1515

Arg Arg Glu Arg Cys Val Leu Ser
1520

<210> 6
<211> 273
<212> PRT

<213> Homo sapiens
<220>

<221> misc_feature

<223> Incyte ID No: 2119377CD1

<400> 6



Leu Gly Gln Lys Leu Ser Gly Ser Leu Lys Ser Val Glu Val Arg

1 5 10 15

Glu Pro Ala Leu Arg Pro Ala Lys Arg Glu Leu Arg Gly Ala Glu

20 25 30

Pro Gly Arg Pro Ala Arg Leu Asp Gln Leu Leu Asp Met Pro Ala

35 40 45

Ala Gly Leu Ala Val Gln Leu Arg His Ala Trp Asn Pro Glu Asp

50 55 60

Arg Ser Leu Asn Val Phe Val Lys Asp Asp Asp Arg Leu Thr Phe

65 70 75

His Arg His Pro Val Ala Gln Ser Thr Asp Gly Ile Arg Gly Lys

80 85 90

Val Gly His Ala Arg Gly Leu His Ala Trp Gln Ile Asn Trp Pro

95 100 105

Ala Arg Gln Arg Gly Thr His Ala Val Val Gly Val Ala Thr Ala

110 115 120

Arg Ala Pro Leu His Ser Val Gly Tyr Thr Ala Leu Val Gly Ser

125 130 135

Asp Ala Glu Ser Trp Gly Trp Asp Leu Gly Arg Ser Arg Leu Tyr

140 145 150

His Asp Gly Lys Asn Gln Pro Gly Val Ala Tyr Pro Ala Phe Leu

155 160 165

Gly Pro Asp Glu Ala Phe Ala Leu Pro Asp Ser Leu Leu Val Val

170 175 180

Leu Asp Met Asp Glu Gly Thr Leu Ser Phe Ile Val Asp Gly Gln

185 190 195

Tyr Leu Gly Val Ala Phe Arg Gly Leu Lys Gly Lys Lys Leu Tyr

200 205 210

Pro Val Val Ser Ala Val Trp Gly His Cys Glu Val Thr Met Arg

215 220 225

Tyr Ile Asn Gly Leu Asp Pro Glu Pro Leu Pro Leu Met Asp Leu

230 235 240

Cys Arg Arg Ser Ile Arg Ser Ala Leu Gly Arg Gln Arg Leu Gln

245 250 255

Asp Ile Ser Ser Leu Pro Leu Pro Gln Ser Leu Lys Asn Tyr Leu

260 265 270

Gln Tyr Gln













<210> 7 

<211> 341 

<212> PRT 

<213> Homo sapiens 

<220>

<221> misc_feature

<223> Incyte ID No: 3176058CD1 

<400> 7 
Met 


Asp 




TtAii TiAii Asn 

UBU AOXl JTXO Axy 


KaXu ser ser Lys Ftie 


He Ala 


1 






c 


1 A 

10 


15 


Glu 


Asn 


Ser 


tkTTT A^it^ VaT PliA Tie* 
njb ^ vax fAAo xxts 


A ays t^aI 

Asp ser \aiy Giy vai 


Arg Arg 










OR 


30 


Val 


Ala 


Glu 


TiOii TiOn r*oii AIa Taaq 


Aia Aia vaiy Fro Giu 


Leu Arg 












45 


Val 


VJr J. U 




irp *-iyS n±Si Lieu MIS 


Glu Leu Asn Pro Arg 


Ala Ala 










55 


60 


Asp 


Gin 


AX CI 


A±SL VSL± ASH IXp VoX 


Fne val Tnr Asp Tnr 


Leu Asn 








^R 


70 


75 


Phe 




PVio 


xrp oex vxu isxn Asp 


Glu His Lys Cys Val 


Val Arg 










85 


90 




AITC^ 


i»±y 


jjys mr lyr oer o±y 


Tyr Trp Ser Leu Cys 


Ala Ala 










100 


105 


VolJ. 


Asn 




Ala Leu Asp Glu Gly 


He Pro He Thr Ser 


Ala Ser 








110 


115 


120 


Tyx 


Tyr 


i\±a 


rnr vai Tnr Leu Asp 


Gin Val Arg Asn He 


Leu Arg 








123 


130 


135 




Asp 


IIUT 


Asp vai ber nee Fro 


Leu Val Glu Glu Arg 


His Arg 










145 


150 


lie 


Leu 


Asn 


\s±\x inr v»iy liys xie 


Leu Leu Glu Lys Phe 


Gly Gly 










160 


165 






Leu 


Asn Cys Val Ar^ Glu 


Ser Glu Asn Ser Ala 


Gin Lys 








1 /U 


175 


180 






nio 


Lieu vajL vaJL gxu ser 


Phe Pro Ser Tyr Arg 


Asp Val 








1 DR 


190 


195 




Leu 


irne 


Glu Gly Lys Arg Val 


Ser Phe Tyr Lys Arg 


Ala Gin 








200 


205 


210 


Tl o 


Leu 




Ala Asp Thr Trp Ser 


Val Leu Glu Gly Lys 


Gly Asp 








215 


220 


225 


VarJ.y 


Cys 


FJtie 


Lys Asp lie Ser Ser 


He Thr Met Phe Ala 


Asp Tyr 








230 


235 


240 




Leu 


Pro 


Gin Val Leu Ala His 


Leu Gly Ala Leu Lys 


Tyr Ser 








245 


250 


255 




Asp 


Leu 


Leu Lys Lys Leu Leu 


Lys Gly Glu Met Leu 


Ser Tyr 








260 


265 


270 




Asp 


Arcf 


vvxn Glu Val Glu lie 


Arg Gly Cys Ser Leu 


Trp Cys 








275 


280 


285 


Val 


Glu 


Leu 


lie Arg Asp Cys Leu 


Leu Glu Leu He Glu 


Gin Lys 








290 


295 


300 


Gly 


Glu 


Lys 


Pro Asn Gly Glu lie 


Asn Ser He Leu Leu 


Asp Tyr 








305 


310 


315 


Tyr 


Leu 


Trp 


Asp Tyr Ala His Asp 


His Arg Glu Asp Met 


Lys Gly 








320 


325 


330 


lie 


Pro 


Phe 


His Arg He Arg Cys 


He Tyr Ty^^ 










335 


340 





<210> 8 
<211> 341 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 2299818CD1 
<400> 8

Met Asn Phe Lys Leu Gly Asn Phe Ser Tyr Gln Lys Asn Pro Leu
1 5 10 15

Lys Leu Gly Glu Leu Gln Gly Asn His Phe Thr Val Val Leu Arg
20 25 30

Asn Ile Thr Gly Thr Asp Asp Gln Val Gln Gln Ala Met Asn Ser
35 40 45

Leu Lys Glu Ile Gly Phe Ile Asn Tyr Tyr Gly Met Gln Arg Phe
50 55 60

Gly Thr Thr Ala Val Pro Thr Tyr Gln Val Gly Arg Ala Ile Leu
65 70 75

Gln Asn Ser Trp Thr Glu Val Met Asp Leu Ile Leu Lys Pro Arg
80 85 90

Ser Gly Ala Glu Lys Gly Tyr Leu Val Lys Cys Arg Glu Glu Trp
95 100 105

Ala Lys Thr Lys Asp Pro Thr Ala Ala Leu Arg Lys Leu Pro Val
110 115 120

Lys Arg Cys Val Glu Gly Gln Leu Leu Arg Gly Leu Ser Lys Tyr
125 130 135

Gly Met Lys Asn Ile Val Ser Ala Phe Gly Ile Ile Pro Arg Asn
140 145 150

Asn Arg Leu Met Tyr Ile His Ser Tyr Gln Ser Tyr Val Trp Asn
155 160 165

Asn Met Val Ser Lys Arg Ile Glu Asp Tyr Gly Leu Lys Pro Val
170 175 180

Pro Gly Asp Leu Val Leu Lys Gly Ala Thr Ala Thr Tyr Ile Glu
185 190 195

Glu Asp Asp Val Asn Asn Tyr Ser Ile His Asp Val Val Met Pro
200 205 210

Leu Pro Gly Phe Asp Val Ile Tyr Pro Lys His Lys Ile Gln Glu
215 220 225

Ala Tyr Arg Glu Met Leu Thr Ala Asp Asn Leu Asp Ile Asp Asn
230 235 240

Met Arg His Lys Ile Arg Asp Tyr Ser Leu Ser Gly Ala Tyr Arg
245 250 255

Lys Ile Ile Ile Arg Pro Gln Asn Val Ser Trp Glu Val Val Ala
260 265 270

Tyr Asp Asp Pro Lys Ile Pro Leu Phe Asn Thr Asp Val Asp Asn
275 280 285

Leu Glu Gly Lys Thr Pro Pro Val Phe Ala Ser Glu Gly Lys Tyr
290 295 300

Arg Ala Leu Lys Met Asp Phe Ser Leu Pro Pro Ser Thr Tyr Ala
305 310 315

Thr Met Ala Ile Arg Glu Val Leu Lys Met Asp Thr Ser Ile Lys
320 325 330

Asn Gln Thr Gln Leu Asn Thr Thr Trp Leu Arg
335 340

<210> 9 
<211> 1185 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 2729451CD1 
<400> 9





Glu 


Pro Asn 


Ser 


Leu 


Gin 


Trp 


Val 


Gly 


Ser 


Pro 


Cys 


Gly 


Leu 


1 






5 










10 










15 


nX3 


GXy 


Pro Tyr 


Jie 
20 


Pne 


Tyr 


Lys 


Ala 


Phe 

25 


Gin 


Phe 


His 


Leu 


Glu 
30 




Lys 


Pro Arg 


He 
35 


Leu 


Ser 


Leu 


Gly 


Asp 
40 


Phe 


Phe 


Phe 


Val 


Arg 
45 


Cys Thr Pro Lys Asp Pro Ile Cys Ile Ala Glu Leu Gln Leu Leu
50 55 60

Trp Glu Glu Arg Thr Ser Arg Gln Leu Leu Ser Ser Ser Lys Leu
65 70 75

Tyr Phe Leu Pro Glu Asp Thr Pro Gln Gly Arg Asn Ser Asp His
80 85 90

Gly Glu Asp Glu Val Ile Ala Val Ser Glu Lys Val Ile Val Lys
95 100 105

Leu Glu Asp Leu Val Lys Trp Val His Ser Asp Phe Ser Lys Trp
110 115 120

Arg Cys Gly Phe His Ala Gly Pro Val Lys Thr Glu Ala Leu Gly
125 130 135

Arg Asn Gly Gln Lys Glu Ala Leu Leu Lys Tyr Arg Gln Ser Thr
140 145 150

Leu Asn Ser Gly Leu Asn Phe Lys Asp Val Leu Lys Glu Lys Ala
155 160 165

Asp Leu Gly Glu Asp Glu Glu Glu Thr Asn Val Ile Val Leu Ser
170 175 180

Tyr Pro Gln Tyr Cys Arg Tyr Arg Ser Met Leu Lys Arg Ile Gln
185 190 195

Asp Lys Pro Ser Ser Ile Leu Thr Asp Gln Phe Ala Leu Ala Leu
200 205 210

Gly Gly Ile Ala Val Val Ser Arg Asn Pro Gln Ile Leu Tyr Cys
215 220 225

Arg Asp Thr Phe Asp His Pro Thr Leu Ile Glu Asn Glu Ser Ile
230 235 240

Cys Asp Glu Phe Ala Pro Asn Leu Lys Gly Arg Pro Arg Lys Lys
245 250 255

Lys Pro Cys Pro Gln Arg Arg Asp Ser Phe Ser Gly Val Lys Asp
260 265 270

Ser Asn Asn Asn Ser Asp Gly Lys Ala Val Ala Lys Val Lys Cys
275 280 285

Glu Ala Arg Ser Ala Leu Thr Lys Pro Lys Asn Asn His Asn Cys
290 295 300

Lys Lys Val Ser Asn Glu Glu Lys Pro Lys Val Ala Ile Gly Glu
305 310 315

Glu Cys Arg Ala Asp Glu Gln Ala Phe Leu Val Ala Leu Tyr Lys
320 325 330

Tyr Met Lys Glu Arg Lys Thr Pro Ile Glu Arg Ile Pro Tyr Leu
335 340 345

Gly Phe Lys Gln Ile Asn Leu Trp Thr Met Phe Gln Ala Ala Gln
350 355 360

Lys Leu Gly Gly Tyr Glu Thr Ile Thr Ala Arg Arg Gln Trp Lys
365 370 375

His Ile Tyr Asp Glu Leu Gly Gly Asn Pro Gly Ser Thr Ser Ala
380 385 390

Ala Thr Cys Thr Arg Arg His Tyr Glu Arg Leu Ile Leu Pro Tyr
395 400 405


Glu Arg Phe Ile Lys Gly Glu Glu Asp Lys Pro Leu Pro Pro Ile
410 415 420

Lys Pro Arg Lys Gln Glu Asn Ser Ser Gln Glu Asn Glu Asn Lys
425 430 435

Thr Lys Val Ser Gly Thr Lys Arg Ile Lys His Glu Ile Pro Lys
440 445 450

Ser Lys Lys Glu Lys Glu Asn Ala Pro Lys Pro Gln Asp Ala Ala
455 460 465

Glu Val Ser Ser Glu Gln Glu Lys Glu Gln Glu Thr Leu Ile Ser
470 475 480

Gln Lys Ser Ile Pro Glu Pro Leu Pro Ala Ala Asp Met Lys Lys
485 490 495

Lys Ile Glu Gly Tyr Gln Glu Phe Ser Ala Lys Pro Leu Ala Ser
500 505 510

Arg Val Asp Pro Glu Lys Asp Asn Glu Thr Asp Gln Gly Ser His
515 520 525

Ser Glu Lys Val Ala Glu Glu Ala Gly Glu Lys Gly Pro Thr Pro
530 535 540

Pro Leu Pro Ser Ala Pro Leu Ala Pro Glu Lys Asp Ser Ala Leu
545 550 555

Val Pro Gly Ala Ser Lys Gln Pro Leu Thr Ser Pro Ser Ala Leu
560 565 570

Val Asp Ser Lys Gln Glu Ser Lys Leu Cys Cys Phe Thr Glu Ser
575 580 585

Pro Glu Ser Glu Pro Gln Glu Ala Ser Phe Pro Thr Thr Gln Pro
590 595 600

Pro Leu Ala Asn Gln Asn Glu Thr Glu Asp Asp Lys Leu Pro Ala
605 610 615

Met Ala Asp Tyr Ile Ala Asn Cys Thr Val Lys Val Asp Gln Leu
620 625 630

Gly Ser Asp Asp Ile His Asn Ala Leu Lys Gln Thr Pro Lys Val
635 640 645

Leu Val Val Gln Ser Phe Asp Met Phe Lys Asp Lys Asp Leu Thr
650 655 660

Gly Pro Met Asn Glu Asn His Gly Leu Asn Tyr Thr Pro Leu Leu
665 670 675

Tyr Ser Arg Gly Asn Pro Gly Ile Met Ser Pro Leu Ala Lys Lys
680 685 690

Lys Leu Leu Ser Gln Val Ser Gly Ala Ser Leu Ser Ser Ser Tyr
695 700 705

Pro Tyr Gly Ser Pro Pro Pro Leu Ile Ser Lys Lys Lys Leu Ile
710 715 720

Ala Arg Asp Asp Leu Cys Ser Ser Leu Ser Gln Thr His His Gly
725 730 735

Gln Ser Thr Asp His Met Ala Val Ser Arg Pro Ser Val Ile Gln
740 745 750

His Val Gln Ser Phe Arg Ser Lys Pro Ser Glu Glu Arg Lys Thr
755 760 765

Ile Asn Asp Ile Phe Lys His Glu Lys Leu Ser Arg Ser Asp Pro
770 775 780

His Arg Cys Ser Phe Ser Lys His His Leu Asn Pro Leu Ala Asp
785 790 795

Ser Tyr Val Leu Lys Gln Glu Ile Gln Glu Gly Lys Asp Lys Leu
800 805 810

Leu Glu Lys Arg Ala Leu Pro His Ser His Met Pro Ser Phe Leu
815 820 825

Ala Asp Phe Tyr Ser Ser Pro His Leu His Ser Leu Tyr Arg His
830 835 840


Thr Glu His His Leu His Asn Glu Gln Thr Ser Lys Tyr Pro Ser
845 850 855

Arg Asp Met Tyr Arg Glu Ser Glu Asn Ser Ser Phe Pro Ser His
860 865 870

Arg His Gln Glu Lys Leu His Val Asn Tyr Leu Thr Ser Leu His
875 880 885

Leu Gln Asp Lys Lys Ser Ala Ala Ala Glu Ala Pro Thr Asp Asp
890 895 900

Gln Pro Thr Asp Leu Ser Leu Pro Lys Asn Pro His Lys Pro Thr
905 910 915

Gly Lys Val Leu Gly Leu Ala His Ser Thr Thr Gly Pro Gln Glu
920 925 930

Ser Lys Gly Ile Ser Gln Phe Gln Val Leu Gly Ser Gln Ser Arg
935 940 945

Asp Cys His Pro Lys Ala Cys Arg Val Ser Pro Met Thr Met Ser
950 955 960

Gly Pro Lys Lys Tyr Pro Glu Ser Leu Ser Arg Ser Gly Lys Pro
965 970 975

His His Val Arg Leu Glu Asn Phe Arg Lys Met Glu Gly Met Val
980 985 990

His Pro Ile Leu His Arg Lys Met Ser Pro Gln Asn Ile Gly Ala
995 1000 1005

Ala Arg Pro Ile Lys Arg Ser Leu Glu Asp Leu Asp Leu Val Ile
1010 1015 1020

Ala Gly Lys Lys Ala Arg Ala Val Ser Pro Leu Asp Pro Ser Lys
1025 1030 1035

Glu Val Ser Gly Lys Glu Lys Ala Ser Glu Gln Glu Ser Glu Gly
1040 1045 1050

Ser Lys Ala Ala His Gly Gly His Ser Gly Gly Gly Ser Glu Gly
1055 1060 1065

His Lys Leu Pro Leu Ser Ser Pro Ile Phe Pro Gly Leu Tyr Ser
1070 1075 1080

Gly Ser Leu Cys Asn Ser Gly Leu Asn Ser Arg Leu Pro Ala Gly
1085 1090 1095

Tyr Ser His Ser Leu Gln Tyr Leu Lys Asn Gln Thr Val Leu Ser
1100 1105 1110

Pro Leu Met Gln Pro Leu Ala Phe His Ser Leu Val Met Gln Arg
1115 1120 1125

Gly Ile Phe Thr Ser Pro Thr Asn Ser Gln Gln Leu Tyr Arg His
1130 1135 1140

Leu Ala Ala Ala Thr Pro Val Gly Ser Ser Tyr Gly Asp Leu Leu
1145 1150 1155

His Asn Ser Ile Tyr Pro Leu Ala Ala Ile Asn Pro Gln Ala Ala
1160 1165 1170

Phe Pro Ser Ser Gln Leu Ser Ser Val His Pro Ser Thr Lys Leu
1175 1180 1185



<210> 10
<211> 1042 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature
<223> Incyte ID No: 878534CD1
<400> 10 

Met Ala Ala Met Ala Pro Ala Leu Thr Asp Ala Ala Ala Glu Ala
1 5 10 15

His His Ile Arg Phe Lys Leu Ala Pro Pro Ser Ser Thr Leu Ser
20 25 30

Pro Gly Ser Ala Glu Asn Asn Gly Asn Ala Asn Ile Leu Ile Ala
35 40 45

Ala Asn Gly Thr Lys Arg Lys Ala Ile Ala Ala Glu Asp Pro Ser
50 55 60

Leu Asp Phe Arg Asn Asn Pro Thr Lys Glu Asp Leu Gly Lys Leu
65 70 75

Gln Pro Leu Val Ala Ser Tyr Leu Cys Ser Asp Val Thr Ser Val
80 85 90

Pro Ser Lys Glu Ser Leu Lys Leu Gln Gly Val Phe Ser Lys Gln
95 100 105

Thr Val Leu Lys Ser His Pro Leu Leu Ser Gln Ser Tyr Glu Leu
110 115 120

Arg Ala Glu Leu Leu Gly Arg Gln Pro Val Leu Glu Phe Ser Leu
125 130 135

Glu Asn Leu Arg Thr Met Asn Thr Ser Gly Gln Thr Ala Leu Pro
140 145 150

Gln Ala Pro Val Asn Gly Leu Ala Lys Lys Leu Thr Lys Ser Ser
155 160 165

Thr His Ser Asp His Asp Asn Ser Thr Ser Leu Asn Gly Gly Lys
170 175 180

Arg Ala Leu Thr Ser Ser Ala Leu His Gly Gly Glu Met Gly Gly
185 190 195

Ser Glu Ser Gly Asp Leu Lys Gly Gly Met Thr Asn Cys Thr Leu
200 205 210

Pro His Arg Ser Leu Asp Val Glu His Thr Ile Leu Tyr Ser Asn
215 220 225

Asn Ser Thr Ala Asn Lys Ser Ser Val Asn Ser Met Glu Gln Pro
230 235 240

Ala Leu Gln Gly Ser Ser Arg Leu Ser Pro Gly Thr Asp Ser Ser
245 250 255

Ser Asn Leu Gly Gly Val Lys Leu Glu Gly Lys Lys Ser Pro Leu
260 265 270

Ser Ser Ile Leu Phe Ser Ala Leu Asp Ser Asp Thr Arg Ile Thr
275 280 285

Ala Leu Leu Arg Arg Gln Ala Asp Ile Glu Ser Arg Ala Arg Arg
290 295 300

Leu Gln Lys Arg Leu Gln Val Val Gln Ala Lys Gln Val Glu Arg
305 310 315

His Ile Gln His Gln Leu Gly Gly Phe Leu Glu Lys Thr Leu Ser
320 325 330

Lys Leu Pro Asn Leu Glu Ser Leu Arg Pro Arg Ser Gln Leu Met
335 340 345

Leu Thr Arg Lys Ala Glu Ala Ala Leu Arg Lys Ala Ala Ser Glu
350 355 360

Thr Thr Thr Ser Glu Gly Leu Ser Asn Phe Leu Lys Ser Asn Ser
365 370 375

Ile Ser Glu Glu Leu Glu Arg Phe Thr Ala Ser Gly Ile Ala Asn
380 385 390

Leu Arg Cys Ser Glu Gln Ala Phe Asp Ser Asp Val Thr Asp Ser
395 400 405

Ser Ser Gly Gly Glu Ser Asp Ile Glu Glu Glu Glu Leu Thr Arg
410 415 420

Ala Asp Pro Glu Gln Arg His Val Pro Leu Arg Arg Arg Ser Glu
425 430 435

Trp Lys Trp Ala Ala Asp Arg Ala Ala Ile Val Ser Arg Trp Asn
440 445 450

Trp Leu Gln Ala His Val Ser Asp Leu Glu Tyr Arg Ile Arg Gln
455 460 465

Gln Thr Asp Ile Tyr Lys Gln Ile Arg Ala Asn Lys Gly Leu Ile
470 475 480

Val Leu Gly Glu Val Pro Pro Pro Glu His Thr Thr Asp Leu Phe
485 490 495

Leu Pro Leu Ser Ser Glu Val Lys Thr Asp His Gly Thr Asp Lys
500 505 510

Leu Ile Glu Ser Val Ser Gln Pro Leu Glu Asn His Gly Ala Pro
515 520 525

Ile Ile Gly His Ile Ser Glu Ser Leu Ser Thr Lys Ser Cys Gly
530 535 540

Ala Leu Arg Pro Val Asn Gly Val Ile Asn Thr Leu Gln Pro Val
545 550 555

Leu Ala Asp His Ile Pro Gly Asp Ser Ser Asp Ala Glu Glu Gln
560 565 570

Leu His Lys Lys Gln Arg Leu Asn Leu Val Ser Ser Ser Ser Asp
575 580 585

Gly Thr Cys Val Ala Ala Arg Thr Arg Pro Val Leu Ser Cys Lys
590 595 600

Lys Arg Arg Leu Val Arg Pro Asn Ser Ile Val Pro Leu Ser Lys
605 610 615

Lys Val His Arg Asn Ser Thr Ile Arg Pro Gly Cys Asp Val Asn
620 625 630

Pro Ser Cys Ala Leu Cys Gly Ser Gly Ser Ile Asn Thr Met Pro
635 640 645

Pro Glu Ile His Tyr Glu Ala Pro Leu Leu Glu Arg Leu Ser Gln
650 655 660

Leu Asp Ser Cys Val His Pro Val Leu Ala Phe Pro Asp Asp Val
665 670 675

Pro Thr Ser Leu His Phe Gln Ser Met Leu Lys Ser Gln Trp Gln
680 685 690

Asn Lys Pro Phe Asp Lys Ile Lys Pro Pro Lys Lys Leu Ser Leu
695 700 705

Lys His Arg Ala Pro Met Pro Gly Ser Leu Pro Asp Ser Ala Arg
710 715 720

Lys Asp Arg His Lys Leu Val Ser Ser Phe Leu Thr Thr Ala Met
725 730 735

Leu Lys His His Thr Asp Met Ser Ser Ser Ser Tyr Leu Ala Ala
740 745 750

Thr His His Pro Pro His Ser Pro Leu Val Arg Gln Leu Ser Thr
755 760 765

Ser Ser Asp Ser Pro Ala Pro Ala Ser Ser Ser Ser Gln Val Thr
770 775 780

Ala Ser Thr Ser Gln Gln Pro Val Arg Arg Arg Arg Gly Glu Ser
785 790 795

Ser Phe Asp Ile Asn Asn Ile Val Ile Pro Met Ser Val Ala Ala
800 805 810

Thr Thr Arg Val Glu Lys Leu Gln Tyr Lys Glu Ile Leu Thr Pro
815 820 825

Ser Trp Arg Glu Val Asp Leu Gln Ser Leu Lys Gly Ser Pro Asp
830 835 840

Glu Glu Asn Glu Glu Ile Glu Asp Leu Ser Asp Ala Ala Phe Ala
845 850 855

Ala Leu His Ala Lys Cys Glu Glu Met Glu Arg Ala Arg Trp Leu
860 865 870

Trp Thr Thr Ser Val Pro Pro Gln Arg Arg Gly Ser Arg Ser Tyr
875 880 885

Arg Ser Ser Asp Gly Arg Thr Thr Pro Gln Leu Gly Ser Ala Asn
890 895 900

Pro Ser Thr Pro Gln Pro Ala Ser Pro Asp Val Ser Ser Ser His
905 910 915

Ser Leu Ser Glu Tyr Ser His Gly Gln Ser Pro Arg Ser Pro Ile
920 925 930

Ser Pro Glu Leu His Ser Ala Pro Leu Thr Pro Val Ala Arg Asp
935 940 945

Thr Leu Arg His Leu Ala Ser Glu Asp Thr Arg Cys Ser Thr Pro
950 955 960

Glu Leu Gly Leu Asp Glu Gln Ser Val Gln Pro Trp Glu Arg Arg
965 970 975

Thr Phe Pro Leu Ala His Ser Pro Gln Ala Glu Cys Glu Asp Gln
980 985 990

Leu Asp Ala Gln Glu Arg Ala Ala Arg Cys Thr Arg Arg Thr Ser
995 1000 1005

Gly Ser Lys Thr Gly Arg Glu Thr Glu Ala Ala Pro Thr Ser Pro
1010 1015 1020

Pro Ile Val Pro Leu Lys Ser Arg His Leu Val Ala Ala Ala Thr
1025 1030 1035

Ala Gln Arg Pro Thr His Arg
1040

<210> 11
<211> 86
<212> PRT
<213> Homo sapiens

<220>

<221> misc_feature
<223> Incyte ID No: 2806157CD1

<400> 11

Met Pro Lys Cys Gly Gly Val Arg Val Trp Ile Lys Asp Trp Asn
1 5 10 15

Val Ala Ser Leu Cys Pro Trp Trp Lys Gly Pro Gln Thr Val Val
20 25 30

Leu Ile Thr Pro Thr Ala Val Asn Val Glu Arg Ile Leu Ala Trp
35 40 45

Ile His His Asn Arg Val Lys Pro Ala Ala Pro Glu Ser Trp Glu
50 55 60

Ala Arg Pro Ser Leu Asp Asn Pro Cys Arg Val Thr Leu Lys Lys
65 70 75

Met Thr Ser Pro Ala Pro Val Thr Pro Arg Ser
80 85

<210> 12
<211> 138
<212> PRT 
<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 5883626CD1 

<400> 12 

Met Lys Met Met Val Val Leu Leu Met Leu Ser Ser Leu Ser Arg
1 5 10 15

Leu Leu Gly Leu Met Arg Pro Ser Ser Leu Arg Gln Tyr Leu Asp
20 25 30

Ser Val Pro Leu Pro Pro Cys Gln Glu Gln Gln Pro Lys Ala Ser
35 40 45

Ala Glu Leu Asp His Lys Ala Cys Tyr Leu Cys His Ser Leu Leu
50 55 60

Met Leu Ala Gly Val Val Val Ser Cys Gln Asp Ile Thr Pro Asp
65 70 75

Gln Trp Gly Glu Leu Gln Leu Leu Cys Met Gln Leu Asp Arg His
80 85 90

Ile Ser Thr Gln Ile Arg Glu Ser Pro Gln Ala Met His Arg Thr
95 100 105

Met Leu Lys Asp Leu Ala Thr Gln Thr Tyr Ile Arg Trp Gln Glu
110 115 120

Leu Leu Thr His Cys Gln Pro Gln Ala Gln Tyr Phe Ser Pro Trp
125 130 135

Lys Asp Ile



<210> 13 

<211> 805 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 2674016CD1 

<400> 13 

Met Trp Asp Gln Gly Gly Gln Pro Trp Gln Gln Trp Pro Leu Asn
1 5 10 15

Gln Gln Gln Trp Met Gln Ser Phe Gln His Gln Gln Asp Pro Ser
20 25 30

Gln Ile Asp Trp Ala Ala Leu Ala Gln Ala Trp Ile Ala Gln Arg
35 40 45

Glu Ala Ser Gly Gln Gln Ser Met Val Glu Gln Pro Pro Gly Met
50 55 60

Met Pro Asn Gly Gln Asp Met Ser Thr Met Glu Ser Gly Pro Asn
65 70 75

Asn His Gly Asn Phe Gln Gly Asp Ser Asn Phe Asn Arg Met Trp
80 85 90

Gln Pro Glu Trp Gly Met His Gln Gln Pro Pro His Pro Pro Pro
95 100 105

Asp Gln Pro Trp Met Pro Pro Thr Pro Gly Pro Met Asp Ile Val
110 115 120
Pro Pro Ser Glu Asp Ser Asn Ser Gln Asp Ser Gly Glu Phe Ala
125 130 135

Pro Asp Asn Arg His Ile Phe Asn Gln Asn Asn His Asn Phe Gly
140 145 150

Gly Pro Pro Asp Asn Phe Ala Val Gly Pro Val Asn Gln Phe Asp
155 160 165

Tyr Gln His Gly Ala Ala Phe Gly Pro Pro Gln Gly Gly Phe His
170 175 180

Pro Pro Tyr Trp Gln Pro Gly Pro Pro Gly Pro Pro Ala Pro Pro
185 190 195

Gln Asn Arg Arg Glu Arg Pro Ser Ser Phe Arg Asp Arg Gln Arg
200 205 210

Ser Pro Ile Ala Leu Pro Val Lys Gln Glu Pro Pro Gln Ile Asp
215 220 225

Ala Val Lys Arg Arg Thr Leu Pro Ala Trp Ile Arg Glu Gly Leu
230 235 240

Glu Lys Met Glu Arg Glu Lys Gln Lys Lys Leu Glu Lys Glu Arg
245 250 255

Met Glu Gln Gln Arg Ser Gln Leu Ser Lys Lys Glu Lys Lys Ala
260 265 270

Thr Glu Asp Ala Glu Gly Gly Asp Gly Pro Arg Leu Pro Gln Arg
275 280 285

Ser Lys Phe Asp Ser Asp Glu Glu Glu Glu Asp Thr Glu Asn Val
290 295 300

Glu Ala Ala Ser Ser Gly Lys Val Thr Arg Ser Pro Ser Pro Val
305 310 315

Pro Gln Glu Glu His Ser Asp Pro Glu Met Thr Glu Glu Glu Lys
320 325 330

Glu Tyr Gln Met Met Leu Leu Thr Lys Met Leu Leu Thr Glu Ile
335 340 345

Leu Leu Asp Val Thr Asp Glu Glu Ile Tyr Tyr Val Ala Lys Asp
350 355 360

Ala His Arg Lys Ala Thr Lys Ala Pro Ala Lys Gln Leu Ala Gln
365 370 375

Ser Ser Ala Leu Ala Ser Leu Thr Gly Leu Gly Gly Leu Gly Gly
380 385 390

Tyr Gly Ser Gly Asp Ser Glu Asp Glu Arg Ser Asp Arg Gly Ser
395 400 405

Glu Ser Ser Asp Thr Asp Asp Glu Glu Leu Arg His Arg Ile Arg
410 415 420

Gln Lys Gln Glu Ala Phe Trp Arg Lys Glu Lys Glu Gln Gln Leu
425 430 435

Leu His Asp Lys Gln Met Glu Glu Glu Lys Gln Gln Thr Glu Arg
440 445 450

Val Thr Lys Glu Met Asn Glu Phe Ile His Lys Glu Gln Asn Ser
455 460 465

Leu Ser Leu Leu Glu Ala Arg Glu Ala Asp Gly Asp Val Val Asn
470 475 480

Glu Lys Lys Arg Thr Pro Asn Glu Thr Thr Ser Val Leu Glu Pro
485 490 495

Lys Lys Glu His Lys Glu Lys Glu Lys Gln Gly Arg Ser Arg Ser
500 505 510

Gly Ser Ser Ser Ser Gly Ser Ser Ser Ser Asn Ser Arg Thr Ser
515 520 525

Ser Thr Ser Ser Thr Val Ser Ser Ser Ser Tyr Ser Ser Ser Ser
530 535 540
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Glv 


Ser 


Ser Arg 


xni. 


oer 


Ser 


Arg oer 


Ser 


Ser Pro Lys Arg 


Lys 
















c c n 




555 


Xjys 


AT*cr 
iuy 




Arg 




Arg 


Ser Pro 


Thr 


xxe Lys Ala Arg 


Arg 




















570 


Sgit 


Arg 




Ser 


Tyr 


oer 


Arg Arg 


xxe 


Lys lie Glu Ser 


Asn 








D / 3 








con 




585 


AX. y 


AJLCl 


Attt \Ta ^ 
/iiy vaJ. 


Lys 


xxe 


Arg 


Asp Arg 


Arg 


Arg Ser Asn Arg 


Asn 
















C Q C 




600 










Arg 


Arg 


Arg Asxi 


Arg 


Ser Pro Ser Arg 


Glu 
















610 




615 




Arg 


Arg Ser 


Arg 


oer 


Arg 


Ser Arg 


Asp 


Arg Arg Thr Asn 


Arg 
















625 




630 




Ser 


Arg oer 


Arg 


Ser 


Arg 


Asp Arg 


Arg 


Lys lie Asp Asp 


Gin 
















640 




645 


Arg 




ash lieu 


Caw- 

ber 


Giy 


Asn 


Ser His 


Lys 


His Lys Gly Glu 


Ala 
















655 




660 


Lys 


f:n 11 




Arg 


Lys 


Lys 


Glu Arg 


Ser 


Arg Ser lie Asp 


Lys 
















670 




675 




Arg 


ijys jLiys 


Lys 


Asp 


Lys 


Glu Arg 


Glu 


Arg Glu Gin Asp 


Lys 
















o85 




690 


Arg 


Lys* 


Glu Lys 


Gxn 


Lys 


Arg 


GiU Glu 


Lys 


Asp Phe Lys Phe 


Ser 
















700 




705 






ASP ASP 


Arg 


Leu 


Lys 


Arg Lys 


Arg 


Glu Ser Glu Arg 


Thr 








710 








715 




720 


oh A 


oer 


Arg Ser 


Gly 


Ser 


xie 


Ser Val 


Lys 


lie lie Arg His 


Asp 








725 








730 




735 


oer 


Arg 


Gin Asp 


Ser 


Lys 


Lys 


Ser Thr 


Thr 


Lys Asp Ser Lys 


Lys 








740 








745 




750 


His 


Ser 


Gly Ser 


Asp 


Ser 


Ser 


Gly Arg 


Ser 


Ser Ser Glu Ser 


Pro 








755 








760 




765 


Gly 


Ser 


Ser Lys 


Glu 


Lys 


Lys 


Ala Lys 


Lys 


Pro Lys His Ser 


Arg 








770 








775 




780 


Ser 


Arg 


Ser Val 


Glu 


Lys 


Ser 


Pro Arg 


Ser 


Gly Lys Lys Ala 


Ser 








785 








790 




795 


Arg 


Lys 


His Lys 


Ser 


Lys 


Ser 


Arg Ser 


Arg 












800 








805 







<210> 14 

<211> 426 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 5994159CD1 

<400> 14 



Met Val Gly Ala Ala His Arg Ala Gln Ala Val Phe Thr Val Val
1 5 10 15

Ser Ser Glu Leu Lys Gly Met Cys Phe His Leu Pro Met Arg Thr
20 25 30

Ala Pro Ser Val Ser Val Trp Leu Glu Thr Cys Pro Ala Ser Leu
35 40 45

Leu Ser Val Leu Leu Ala Pro Val Arg Pro Pro His Arg Arg Ile
50 55 60

Ala Val Leu Val Phe Gln Ala Asp Gly Ser Val Ser Cys Lys Arg
65 70 75

Thr Asp Cys Val Asp Ser Cys Pro His Pro Ile Arg Ile Pro Gly
80 85 90

61zi Cys Cys Pro Asp Cys Ser Ala Gly Cys Thr Tyr Thr Gly Arg
95 100 105

Ile Phe Tyr Asn Asn Glu Thr Phe Pro Ser Val Leu Asp Pro Cys
110 115 120

Leu Ser Cys Ile Cys Leu Leu Gly Ser Val Ala Cys Ser Pro Val
125 130 135

Asp Cys Pro Ile Thr Cys Thr Tyr Pro Phe His Pro Asp Gly Glu
140 145 150

Cys Cys Pro Val Cys Arg Asp Cys Asn Tyr Glu Gly Arg Lys Val
155 160 165

Ala Asn Gly Gln Val Phe Thr Leu Asp Asp Glu Pro Cys Thr Arg
170 175 180

Cys Thr Cys Gln Leu Gly Glu Val Ser Cys Glu Lys Val Pro Cys
185 190 195

Gln Arg Ala Cys Ala Asp Pro Ala Leu Leu Pro Gly Asp Cys Cys
200 205 210

Ser Ser Cys Pro Asp Ser Leu Ser Pro Leu Glu Glu Lys Gln Gly
215 220 225

Leu Ser Pro His Gly Asn Val Ala Phe Ser Lys Ala Gly Arg Ser
230 235 240

Leu His Gly Asp Thr Glu Ala Pro Val Asn Cys Ser Ser Cys Pro
245 250 255

Gly Pro Pro Thr Ala Ser Pro Ser Arg Pro Val Leu His Leu Leu
260 265 270

Gln Leu Leu Leu Arg Thr Asn Leu Met Lys Thr Gln Thr Leu Pro
275 280 285

Thr Ser Pro Ala Gly Ala His Gly Pro His Ser Leu Ala Leu Gly
290 295 300

Leu Thr Ala Thr Phe Pro Gly Glu Pro Gly Ala Ser Pro Arg Leu
305 310 315

Ser Pro Gly Pro Ser Thr Pro Pro Gly Ala Pro Thr Leu Pro Leu
320 325 330

Ala Ser Pro Gly Ala Pro Gln Pro Pro Pro Val Thr Pro Glu Arg
335 340 345

Ser Phe Ser Ala Ser Gly Ala Gln Ile Val Ser Arg Trp Pro Pro
350 355 360

Leu Pro Gly Thr Leu Leu Thr Glu Ala Ser Ala Leu Ser Met Met
365 370 375

Asp Pro Ser Pro Ser Lys Thr Pro Ile Thr Leu Leu Gly Pro Arg
380 385 390

Val Leu Ser Pro Thr Thr Ser Arg Leu Ser Thr Ala Leu Ala Ala
395 400 405

Thr Thr His Pro Gly Pro Gln Gln Pro Pro Val Gly Ala Ser Arg
410 415 420

Gly Glu Glu Ser Thr Met
425

<210> 15
<211> 267
<212> PRT

<213> Homo sapiens
<220>
<221> misc_feature

<223> Incyte ID No: 2457335CD1 

<400> 15 



wee Tyr Leu Arg Arg Ala Val Ser Lys Thr Leu Ala Leu Pro Leu
1 5 10 15

Arg Ala Pro Pro Asn Pro Ala Pro Leu Gly Lys Asp Ala Ser Leu
20 25 30

Arg Arg Met Ser Ser Asn Arg Phe Pro Gly Ser Ser Gly Ser Asn
35 40 45

Met Ile Tyr Tyr Leu Val Val Gly Val Thr Val Ser Ala Gly Gly
50 55 60

Tyr Tyr Ala Tyr Lys Thr Val Thr Ser Asp Gln Ala Lys His Thr
65 70 75

Glu His Lys Thr Asn Leu Lys Glu Lys Thr Lys Ala Glu Ile His
80 85 90

Pro Phe Gln Gly Glu Lys Glu Asn Val Ala Glu Thr Glu Lys Ala
95 100 105

Ser Ser Glu Ala Pro Glu Glu Leu Ile Val Glu Ala Glu Val Val
110 115 120

Asp Ala Glu Glu Ser Pro Ser Ala Thr Val Val Val Ile Lys Glu
125 130 135

Ala Ser Ala Cys Pro Gly His Val Glu Ala Ala Pro Glu Thr Thr
140 145 150

Ala Val Ser Ala Glu Thr Gly Pro Glu Val Thr Asp Ala Ala Ala
155 160 165

Arg Glu Thr Thr Glu Val Asn Pro Glu Thr Thr Pro Glu Val Thr
170 175 180

Asn Ala Ala Leu Asp Glu Ala Val Thr Ile Asp Asn Asp Lys Asp
185 190 195

Thr Thr Lys Asn Glu Thr Ser Asp Glu Tyr Ala Glu Leu Glu Glu
200 205 210


Glu Asn Ser 


Pro Ala 


Glu 


Ser 


Glu S&T Ser 
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215 






220 




225 


Gin Glu Glu 


Ala Ser 


Val 


Gly 


Ser Glu Ala 


Ala Ser Ala Gin 


Gly 




230 






235 




240 


Asn Leu Gin 


Pro Val 


Asp 


lie 


Ser Ala Thr 


Asn Ala^ Tl© Glv 

f&'JUVl ^iJi^ y^jmj 






245 






250 




255 


Leu lie Ser 


Ala Leu 


Val 


Phe 


Leu Val His 


Leu Val 






260 






265 
Met Glu Gly Ala Gly Glu Asn Ala Pro Glu Ser Ser Ser Ser Ala
1 5 10 15

Pro Gly Ser Glu Glu Ser Ala Arg Asp Pro Gln Val Pro Pro Pro
20 25 30

Glu Glu Glu Ser Gly Asp Cys Ala Arg Ser Leu Glu Ala Val Pro
35 40 45

Lys Lys Leu Cys Gly Tyr Leu Ser Lys Phe Gly Gly Lys Gly Pro
50 55 60

Ile Arg Gly Trp Lys Ser Arg Trp Phe Phe Tyr Asp Glu Arg Lys
65 70 75

Cys Gln Leu Tyr Tyr Ser Arg Thr Ala Gln Asp Ala Asn Pro Leu
80 85 90

Asp Ser Ile Asp Leu Ser Ser Ala Val Phe Asp Cys Lys Ala Asp
95 100 105

Ala Glu Glu Gly Ile Phe Glu Ile Lys Thr Pro Ser Arg Val Ile
110 115 120

Thr Leu Lys Ala Ala Thr Lys Gln Ala Met Leu Tyr Trp Leu Gln
125 130 135

Gln Leu Gln Met Lys Arg Trp Glu Phe His Asn Ser Pro Pro Ala
140 145 150

Pro Pro Ala Thr Pro Asp Ala Ala Leu Ala Gly Asn Gly Pro Val
155 160 165

Leu His Leu Glu Leu Gly Gln Glu Glu Ala Glu Leu Glu Glu Phe
170 175 180

Leu Cys Pro Val Lys Thr Pro Pro Gly Leu Val Gly Val Ala Ala
185 190 195

Ala Leu Gln Pro Phe Pro Ala Leu Gln Asn Ile Ser Leu Lys His
200 205 210

Leu Gly Thr Glu Ile Gln Asn Thr Met His Asn Ile Arg Gly Asn
215 220 225

Lys Gln Ala Gln Gly Thr Gly His Glu Pro Pro Gly Glu Asp Ser
230 235 240

Thr Gln Ser Gly Glu Pro Gln Arg Glu Glu Gln Pro Ser Ala Ser
245 250 255

Asp Ala Ser Thr Pro Val Arg Glu Pro Glu Asp Ser Pro Lys Pro
260 265 270

Ala Pro Lys Pro Ser Leu Thr Ile Ser Phe Ala Gln Lys Ala Lys
275 280 285

Arg Gln Asn Asn Thr Phe Pro Phe Phe Ser Glu Gly Ile Thr Arg
290 295 300

Asn Arg Thr Ala Gln Glu Lys Val Ala Ala Leu Glu Gln Gln Val
305 310 315

Leu Met Leu Thr Lys Glu Leu Lys Ser Gln Lys Glu Leu Val Lys
320 325 330

Ile Leu His Lys Ala Leu Glu Ala Ala Gln Gln Glu Lys Arg Ala
335 340 345

Ser Ser Ala Tyr Leu Ala Ala Ala Glu Asp Lys Asp Arg Leu Glu
350 355 360

Leu Val Arg His Lys Val Arg Gln Ile Ala Glu Leu Gly Arg Arg
365 370 375

Val Glu Ala Leu Glu Gln Glu Arg Glu Ser Leu Ala His Thr Ala
380 385 390

Ser Leu Arg Glu Gln Gln Val Gln Glu Leu Gln Gln His Val Gln
395 400 405

Leu Leu Met Asp Lys Asn His Ala Glu Gln Gln Val Ile Cys Lys
410 415 420

Leu Ser Glu Lys Val Thr Gln Asp Phe Thr His Pro Pro Asp Gln
425 430 435

Ser Pro Leu Arg Pro Asp Ala Ala Asn Arg Asp Phe Leu Ser Gln
440 445 450

Gln Gly Lys Ile Glu His Leu Lys Asp Asp Met Glu Ala Tyr Arg
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Leu Ala, Gly Leu 


Arg 


Arg Leu Gin 


Glu 


Ala 


Leu Gly 


Asp Glu 


Ala 




blD 






con 






c 0 c 
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Ser Glu Cys Sez* 


Glu 


Leu Leu Arg 


Gin 


Leu 


Val Gin 


Glu Ala 


Leu 




con 






Dob 






540 


oJ.Il IjTp V7XU A-Ldi 
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Glu Ala Ser 


Ser 


ASp 


Ser He 


Glu Leu 


Ser 




c ^ c 
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c c c 

555 


Pro He Sex Lys 


Tyr 


Asp Glu Tyr 


Gly 


Pne 


Leu Thr 


Val Pro 


Asp 




oo\) 






bob 






CT A 

b/O 


lyr \aXM vax gxu 


Asp 


Leu Lys Leu 


Leu 


Ala 


Lys He 


Gin Ala 


Leu 




D/O 






C 0 A 

DoO 






c 0 c 
bob 


Glu Ser Acq Ser 


HIS 


Ills Leu Leu 


Gly 


Leu 


Glu Ala 


Val Asp 


Arg 




C A A 






595 






C A A 


Pro Leu Arg Glu 


Arg 


Trp Ala Ala 


Leu 


Gly 


Asp Leu 


Val Pro 


Ser 




o05 






Ct A 
OlO 






olb 


Ala Glu Leu Lys 


Gin 


Leu Leu Arg 


Ala 


Gly 


Val Pro 


Arg Glu 


HIS 










/roc 








Arg Pro Arg Val 


Trp 


Arg Trp Leu 


Val 


rllS 


Leu Arg 


vai Gin 


nlS 
















C A C 

D4b 


Leu His Thr Pro 


Gly 


Cys Tyr Gin 


Glu 


Leu 


Leu Ser 


Arg Gly 


Gin 




f c n 






fee 

655 






C f A 

660 


Ala Arg Glu His Pro Ala Ala Arg Gln Ile Glu Leu Asp Leu Asn
665 670 675

Arg Thr Phe Pro Asn Asn Lys His Phe Thr Cys Pro Thr Ser Ser
680 685 690

Phe Pro Asp Lys Leu Arg Arg Val Leu Leu Ala Phe Ser Trp Gln
695 700 705

Asn Pro Thr Ile Gly Tyr Cys Gln Gly Leu Asn Arg Leu Ala Ala
710 715 720

Ile Ala Leu Leu Val Leu Glu Glu Glu Glu Ser Ala Phe Trp Cys
725 730 735

Leu Val Ala Ile Val Glu Thr Ile Met Pro Ala Asp Tyr Tyr Cys
740 745 750

Asn Thr Leu Thr Ala Ser Gln Val Asp Gln Arg Val Leu Gln Asp
755 760 765

Leu Leu Ser Glu Lys Leu Pro Arg Leu Met Ala His Leu Gly Gln
770 775 780

His His Val Asp Leu Ser Leu Val Thr Phe Asn Trp Phe Leu Val
785 790 795

Val Phe Ala Asp Ser Leu Ile Ser Asn Ile Leu Leu Arg Val Trp
800 805 810

Asp Ala Phe Leu Tyr Glu Gly Thr Lys Val Val Phe Arg Tyr Ala
815 820 825

Leu Ala Ile Phe Lys Tyr Asn Glu Lys Glu Ile Leu Arg Leu Gln
830 835 840

Asn Gly Leu Glu Ile Tyr Gln Tyr Leu Arg Phe Phe Thr Lys Thr
845 850 855

Ile Ser Asn Ser Arg Lys Leu Met Asn Ile Ala Phe Asn Asp Met
860 865 870

Asn Pro Phe Arg Met Lys Gln Leu Arg Gln Leu Arg Met Val His
875 880 885

Arg Glu Arg Leu Glu Ala Glu Leu Arg Glu Leu Glu Gln Leu Lys
890 895 900

Ala Glu Tyr Leu Glu Arg Arg Ala Ser Arg Arg Arg Ala Val Ser
905 910 915

Glu Gly Cys Ala Ser Glu Asp Glu Val Glu Gly Glu Ala
920 925
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Met Trp Val 


Leu 


Leu 


Arg 


Ser 


Glv Tvr 


Pro 


Leu 


Arg lie Leu 


Leu 


1 




5 








10 






15 


Piro Leu Asrg 


Glv 


Glu 


Tm 


Met 


Gly Arg 


Arg 


Gly 


Leu Pro Arg 


Asn 






20 








25 






30 


Leu Ala. Pro 


Glv 


Pro 


Pro 




Arg Arg 


Tyr 


Arg 


Lys Glu Thr 


Leu 






35 








40 






45 


Gin Ala Leu 


Asp 


Met 


Pro 


Val 




Val 


Thr 


Ala Thr Glu 


He 






50 








55 






60 


At'CT m Tl "PVT" 


Leu 


Arg 


Gly 


His 




Pro 


Phe 


Gin Asp Gly 


His 






65 








70 
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Ala 




Ser 




Ala 


Glu 


Ser Ser Gin 


Leu 






80 








85 






90 


ijjfs vXjf V9j.n 


Ixix 


Gly 


VclX 


xxir 


inr oer 


Phe 


Ser 


Leu Phe He 


Asp 






95 








100 






105 


Lys Thr Thr 


Gly 


His 


Phe 


Leu 


Cys Met 


Thr 


Ser 


Leu Ala Glu 


Gly 






110 








115 






120 


Ser Trp Glu 


Asp 


Phe 


Gin 


Ala 


Ser Val 


Glu 


Gly 


Arg Gly Asp 


Gly 






125 








130 






135 


Ala Arg Glu 


Gly 


Phe 


Leu 


Leu 


Ser Lys 


Ala 


Pro 


Glu Phe Glu 


Asp 






140 








145 






150 


Ser Glu Glu 


Val 


Arg 


Arg 


lie 


Trp Asn 


Arg 


Ala 


lie Pro Leu 


Trp 






155 








160 






165 


Glu Leu Pro 


Asp 


Gin 


Glu 


Glu 


Val Gin 


Leu 


Ala 


Asp Thr Met 


Phe 






170 








175 






180 


Gly Leu Thr 


Lys 


val 


Thr 


Asp 


Asp Thr 


Leu 


Lys 


Arg Phe Ser 


Val 






185 








190 






195 


Arg Tyr Leu 


Arg 


Pro 


Ala 


Arg 


Ser Leu 


Val 


Phe 


Pro Trp Phe 


Ser 






200 








205 






210 


Pro Gly Gly 


Ser 


Gly 


Leu 


Arg 


Gly Leu 


Lys 


Leu 


Leu Glu Ala 


Lys 






215 








220 






225 


Cys Gin Gly 


Asp 


Gly 


Val 


Ser 


Tyr Glu 


Glu 


Thr 


Thr He Pro 


Arg 






230 








235 






240 


Pro Ser Ala 


Tyr 


His 


Asn 


Leu 


Phe Gly 


Leu 


Pro 


Leu He Ser 


Arg 






245 








250 






255 


Arg Asp. Ala 


Glu 


Val 


Val 


Leu 


Thr Ser 


Arg 


Glu 


Leu Asp Ser 


Leu 






260 








265 






270 


Ala Leu Asn 


Gin 


Ser 


Thr 


Gly 


Leu Pro 


Thr 


Leu 


Thr Leu Pro 


Arg 






275 








280 






285 
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lux 


Thr 


L^S Lieu 


Pro 


Pro 


Ala 


Lieu 


i^eu 


Pro 


Tyr Leu Glu Gin 






















300 


cue 


Arg 


Arg 


lie vai 


irne 


Trp 


Lieu 


Giy 


ASp 


Asp 


Leu Arg Ser Trp 


















jIO 




315 




i\JLa 


A J. a 


Lys Leu 


pne 


Ala 


Arg 


Lys 


Leu 


Asn 


Pro Lys Arg Cys 


















"3 O C 




330 


IT 116 


Levi 


vax 


Arg Pro 


Vjriy 


Asp 


Gin 


Gin 


Pro 


Arg 


Pro Leu Glu Ala 


















340 




345 




Asn 


«xy 


uiy irne 


Asn 


r AVI 

jjeu 


Ser 


Arg 


lie 


Leu 


Arg Thr Ala Leu 


















^3 C C 

355 




360 


Pro 


A±Si 


Trp 


nlS L>ys 


Ser 


xie 


Val 


Ser 


Pne 


Arg 


Gin Leu Arg Glu 








365 










370 




375 


vilU 


vai 


Leu 


Gly Glu 


Leu 


Ser 


Asn 


Val 


Glu 


Gin 


Ala Ala Gly Leu 








380 










385 




390 


Arg 


Trp 


Ser 


Arg Phe 


Pro 


Asp 


Leu 


Asn 


Arg 


He 


Leu Lys Gly His 








O ft c 

395 










400 




405 




Lys 


Giy 


Glu Leu 


Thr 


val 


Phe 


Thr 


Gly 


Pro 


Thr Gly Ser Gly 








410 










415 




420 




Tnr 


Tnr 


Phe XI e 


Ser 


Glu 


Tyr 


Ala 


Leu 


Asp 


Leu Cys Ser Gin 








>l o c 










430 




435 




vaj. 


Asn 


Thr Leu 


Trp 


Gly 


Ser 


pne 


Glu 


He 


Ser Asn Val Arg 


















445 




450 


Lieu 


AJ.a 


Arg 


Val Met 


Leu 


Thr 


Gin 


Phe 


Ala 


Glu 


Gly Arg Leu Glu 
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455 










460 




465 


Asp 


Gin 


Leu 


Asp Lys 


Tyr 


Asp 


His 


Trp 


Ala 


Asp 


Arg Phe Glu Asp 








470 










475 




480 


Leu 


Pro 


Leu 


Tyr Phe 


Met 


Thr 


Phe 


His 


Gly 


Gin 


Gin Ser He Arg 








485 










490 




495 


Thr 


Val 


lie 


Asp Thr 


Met 


Gin 


His 


Ala 


Val 


Tyr 


Val Tyr Asp He 








500 










505 




510 


Cys 


His 


Val 


lie lie 


Asp 


Asn 


Leu 


Gin 


Phe 


Met 


Met Gly His Glu 








515 










520 




525 


Gin 


Leu 


Ser 


Thr Asp 


Arg 


He 


Ala 


Ala 


Gin 


Asp 


Tyr He He Gly 








530 










535 




540 


Vai 


Fne 


Arg 


Lys Phe 


Ala 


Thr 


Asp 


Asn 


Asn 


Cys 


His Val Thr Leu 








545 










550 




555 


Val 


lie 


His 


Pro Arg 


Lys 


Glu 


Asp 


Asp 


Asp 


Lys 


Glu Leu Gin Thr 








560 










565 




570 


Ala 


Ser 


lie 


Phe Gly 


Ser 


Ala 


Lys 


Ala 


Ser 


Gin 


Glu Ala Asp Asn 








575 










580 




585 


va± 


Leu 


lie 


Leu Gin 


Asp 


Arg 


Lys 


Leu 


Val 


Thr 


Gly Pro Gly Lys 








590 










595 




600 


Arg 


Tyr 


Leu 


Gin Val 


Ser 


Lys 


Asn 


Arg 


Phe 


Asp 


Gly Asp Val Gly 








605 










610 




615 


vajL 


Jriie 


Pro 


Leu Glu 


Pne 


Asn 


Lys 


Asn 


Ser 


Leu 


Thr Phe Ser He 








620 










625 




630 


Pro 


Pro 


Lys 


Asn Lys 


Ala 


Arg 


Leu 


Lys 


Lys 


He 


Lys Asp Asp Thr 
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640 




645 


Gly 


Pro 


Val 


Ala Lys 


Lys 


Pro 


Ser 


Ser 


Gly 


Lys 


Lys Gly Ala Thr 








650 










655 




660 


Thr 


Gin 


Asn 


Ser Glu 


He 


Cys 


Ser 


Gly 


Gin 


Ala 


Pro Thr Pro Asp 








665 










670 




675 


Gin 


Pro 


Asp 


Thr Ser 


Lys 


Arg 


Ser 


Lys 









680 
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Thr Lys Thr 


Ala 


Leu 


Leu 


Lys Leu Phe Val Ala He 


Val He 
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10 
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Thr 


Phe lie Leu 


He 


Leu 


Pro 


Glu Tyr Phe Lys Thr Pro 


Lys Glu 
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25 


30 


Arg 


Thr Leu Glu 


Leu 


Ser 


Cys 


Leu Glu Val Cys Leu Gin 


Ser Asn 
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40 
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Phe 


Thr Tyr Ser 


Leu 


Ser 


Ser 


Leu Asn Phe Ser Phe Val 


Thr Phe 
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55 


60 


Leu 


Gin Pro Val 


Arg 


Glu 


Thr 


Gin He He Met Arg He 
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70 
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Pro Ser Asn 


Phe 


Arg 
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Phe Thr Arg Thr Cys Gin 


Asp He 






80 






85 


90 


Thr 


Val Leu He 


Arg 


Arg 


Gly 


Ser Met Glu Val Lys Ala 


Asn Asp 






95 






100 


105 


Phe 


His Ser Pro 


Cys 


Gin 


His 


Phe Asn Phe Ser Val Ala 


Pro Leu 






110 






115 


120 


Val 


Asp His Leu 


Glu 


Glu 


Tyr 


Asn Thr Thr Cyp His Leu 


Lys Asn 






125 






130 


135 


His 


Thr Gly Arg 


Ser 


Thr 


He 


Met Glu Asp Glu Pro Ser 


Lys Glu 






140 






145 


150 


Lys 


Ser He Asn 


Tyr 


Thr 


Cys 


Arg He Met Glu Tyr Pro 


Asn Asp 






155 






160 


165 


Cys 


He His lie 


Ser 


Leu 


His 


Leu Glu Met Asp He Lys 


Asn He 






170 






175 


180 


Thr 


Cys Ser Met 


Lys 


He 


Thr 


Tzp Tyr He Leu Val Leu 


Leu Val 






185 






190 


195 


Phe 


He Phe Leu 


He 


He 


Leu 


Thr He Arg Lys He lieu 


Glu Gly 






200 






' 205 


210 


Gin 


Arg Arg Val 


Gin 


Lys 


Trp 


Gin Ser His Arg Asp Lys 


Pro Thr 






215 






220 


225 


Ser 


Val Leu Leu 


Arg 


Gly 


Ser 


Asp Ser Glu Lys Leu Arg 


Ala Leu 






230 






235 


240 


Asn 


Val Gin Val 


Leu 


Ser 


Glu 


Thr Thr Gin Arg Leu Pro 


Leu Asp 






245 






250 


255 


Gin 


Val Gin Glu 


Val 


Leu 


Pro 


Pro Ile Pro Glu Leu

Met Ala Asp Asn Leu Asp Glu Phe Ile Glu Glu Gln Lys Ala Arg
1 5 10 15

Leu Ala Glu Asp Lys Ala Glu Leu Glu Ser Asp Pro Pro Tyr Met
20 25 30

Glu Met Lys Gly Lys Leu Ser Ala Lys Leu Ser Glu Asn Ser Lys
35 40 45

Ile Leu Ile Ser Met Ala Lys Glu Asn Ile Pro Pro Asn Ser Gln
50 55 60

Gln Thr Arg Gly Ser Leu Gly Ile Asp Tyr Gly Leu Ser Leu Pro
65 70 75

Leu Gly Glu Asp Tyr Glu Arg Lys Lys His Lys Leu Lys Glu Glu
80 85 90

Leu Arg Gln Asp Tyr Arg Arg Tyr Leu Thr Gln Glu Arg Leu Lys
95 100 105

Leu Glu Arg Asn Lys Glu Tyr Asn Gln Phe Leu Arg Gly Lys Glu
110 115 120

Glu Ser Ser Glu Lys Phe Arg Gln Val Glu Lys Ser Thr Glu Pro
125 130 135

Lys Ser Gln Arg Asn Lys Lys Pro Ile Gly Gln Val Lys Pro Asp
140 145 150

Leu Thr Ser Gln Ile Gln Thr Ser Cys Glu Asn Ser Glu Gly Pro
155 160 165

Arg Lys Asp Val Leu Thr Pro Ser Glu Ala Tyr Glu Glu Leu Leu
170 175 180

Asn Gln Arg Arg Leu Glu Glu Asp Arg Tyr Arg Gln Leu Asp Asp
185 190 195

Glu Ile Glu Leu Arg Asn Arg Arg Ile Ile Lys Lys Ala Asn Glu
200 205 210

Glu Val Gly Ile Ser Asn Leu Lys His Gln Arg Phe Ala Ser Lys
215 220 225

Ala Gly Ile Pro Asp Arg Arg Phe His Arg Phe Asn Glu Asp Arg
230 235 240

Val Phe Asp Arg Arg Tyr His Arg Pro Asp Gln Asp Pro Glu Val
245 250 255

Ser Glu Glu Met Asp Glu Arg Phe Arg Tyr Glu Ser Asp Phe Asp
260 265 270

Arg Arg Leu Ser Arg Val Tyr Thr Asn Asp Arg Met His Arg Asn
275 280 285

Lys Arg Gly Asn Met Pro Pro Met Glu His Asp Gly Asp Val Ile
290 295 300

Glu Gln Ser Asn Ile Arg Ile Ser Ser Ala Glu Asn Lys Ser Ala
305 310 315

Pro Asp Asn Glu Thr Ser Lys Ser Ala Asn Gln Asp Thr Cys Ser
320 325 330

Pro Phe Ala Gly Met Leu Phe Gly Gly Glu Asp Arg Glu Leu Ile
335 340 345

Gln Arg Arg Lys Glu Lys Tyr Arg Leu Glu Leu Leu Glu Gln Met
350 355 360

Ala Glu Gln Gln Arg Asn Lys Arg Arg Glu Lys Asp Leu Glu Leu
365 370 375

Arg Val Ala Ala Ser Gly Ala Gln Asp Pro Glu Lys Ser Pro Asp
380 385 390

Arg Leu Lys Gln Phe Ser Val Ala Pro Arg His Phe Glu Glu Met
395 400 405

Ile Pro Pro Glu Arg Pro Arg Ile Ala Phe Gln Thr Pro Leu Pro

410 415 420

Pro Leu Ser Ala Pro Ser Val Pro Pro Ile Pro Ser Val His Pro
425 430 435

Val Pro Ser Gln Asn Glu Asp Leu Arg Ser Gly Leu Ser Ser Ala
440 445 450

Leu Gly Glu Met Val Ser Pro Arg Ile Ala Pro Leu Pro Pro Pro
455 460 465

Pro Leu Leu Pro Pro Leu Ala Thr Asn Tyr Arg Thr Pro Tyr Asp
470 475 480

Asp Ala Tyr Tyr Phe Tyr Gly Ser Arg Asn Thr Phe Asp Pro Ser
485 490 495

Leu Ala Tyr Tyr Gly Ser Gly Met Met Gly Val Gln Pro Ala Ala
500 505 510

Tyr Val Ser Ala Pro Val Thr His Gln Leu Ala Gln Pro Val Val
515 520 525

Val Ser Pro Cys His Pro Gly Trp Ser Thr Met Leu
530 535

<210> 20
<211> 312
<212> PRT
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No: 484404CD1
<400> 20

Met Trp Ser Glu Gly Arg Glu Tyr Glu Arg Ile Pro Arg Glu
10 15

Arg Ala Pro Pro Arg Ser His Pro Ser Asp Glu Ser Gly Tyr Arg
20 25 30

Trp Thr Arg Asp Asp His Ser Ala Ser Arg Gln Pro Glu Tyr Arg
35 40 45

Asp Met Arg Asp Gly Phe Arg Arg Lys Ser Phe Tyr Ser Ser His
50 55 60

Tyr Ala Arg Glu Arg Ser Pro Tyr Lys Arg Asp Asn Thr Phe Phe
65 70 75

Arg Glu Ser Pro Val Gly Arg Lys Asp Ser Pro His Ser Arg Ser
80 85 90

Gly Ser Ser Val Ser Ser Arg Ser Tyr Ser Pro Glu Arg Ser Lys
95 100 105

Ser Tyr Ser Phe His Gln Ser Gln His Arg Lys Ser Val Arg Pro
110 115 120

Gly Ala Ser Tyr Lys Arg Gln Asn Glu Gly Asn Pro Glu Arg Asp
125 130 135

Lys Glu Arg Pro Val Gln Ser Leu Lys Thr Ser Arg Asp Thr Ser
140 145 150

Pro Ser Ser Gly Ser Ala Val Ser Ser Ser Lys Val Leu Asp Lys
155 160 165

Pro Ser Arg Leu Thr Glu Lys Glu Leu Ala Glu Ala Ala Ser Lys
170 175 180

Trp Ala Ala Glu Lys Leu Glu Lys Ser Asp Glu Ser Asn Leu Pro
185 190 195

Glu Ile Ser Glu Tyr Glu Ala Gly Ser Thr Ala Pro Leu Phe Thr
200 205 210

Asp Gln Pro Glu Glu Pro Glu Ser Asn Thr Thr His Gly Ile Glu
215 220 225


ijeu jriie urXu Asp 


Ser 




Leu Thr Thr Arg 


Ser Lys Ala lie 


Ala 




o o n 




235 




240 


tsGx uys Tnr uys 


Glu 


xie 


Glu Gin Val Tyr 


Arg Gin Asp Cys 


Glu 




245 




250 




255 


Thr Phe Gly Met Val Val Lys Met Leu Ile Glu Lys Asp Pro Ser
260 265 270

Leu Glu Lys Ser Ile Gln Phe Ala Leu Arg Gln Asn Leu His Glu
275 280 285

Ile Gly Glu Arg Cys Val Glu Glu Leu Lys His Phe Ile Ala Glu
290 295 300

Tyr Asp Thr Ser Thr Gln Asp Phe Gly Glu Pro Phe
305 310

<210> 21
<211> 1400
<212> PRT
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No: 2830063CD1
<400> 21



Met Met Ala Ser Phe Gln Arg Ser Asn Ser His Asp Lys Val Arg
1 5 10 15

Arg Ile Val Ala Glu Glu Gly Arg Thr Ala Arg Asn Leu Ile Ala
20 25 30

Trp Ser Val Pro Leu Glu Ser Lys Asp Asp Asp Gly Lys Pro Lys
35 40 45

Cys Gln Thr Gly Gly Lys Ser Lys Arg Thr Ile Gln Gly Thr His
50 55 60

Lys Thr Thr Lys Gln Ser Thr Ala Val Asp Cys Lys Ile Thr Ser
65 70 75

Ser Thr Thr Gly Asp Lys His Phe Asp Lys Ser Pro Thr Lys Thr
80 85 90

Arg His Pro Arg Lys Ile Asp Leu Arg Ala Arg Tyr Trp Ala Phe
95 100 105

Leu Phe Asp Asn Leu Arg Arg Ala Val Asp Glu Ile Tyr Val Thr
110 115 120

Cys Glu Ser Asp Gln Ser Val Val Glu Cys Lys Glu Val Leu Met
125 130 135

Met Leu Asp Asn Tyr Val Arg Asp Phe Lys Ala Leu Ile Asp Trp
140 145 150

Ile Gln Leu Gln Glu Lys Leu Glu Lys Thr Asp Ala Gln Ser Arg
155 160 165

Pro Thr Ser Leu Ala Trp Glu Val Lys Lys Met Ser Pro Gly Arg
170 175 180

His Val Ile Pro Ser Pro Ser Thr Asp Arg Ile Asn Val Thr Ser
185 190 195

Asn Ala Arg Arg Ser Leu Asn Phe Gly Gly Ser Thr Gly Thr Val
200 205 210

Pro Ala Pro Arg Leu Ala Pro Thr Gly Val Ser Trp Ala Asp Lys
215 220 225

Val Lys Ala His His Thr Gly Ser Thr Ala Ser Ser Glu Ile Thr
230 235 240

Pro Ala Gln Ser Cys Pro Pro Met Thr Val Gln Lys Ala Ser Arg

245 250 255 

Lys Asn Glu Arg Lys Asp Ala Glu Gly Trp Glu Thr Val Gln Arg
260 265 270

Gly Arg Pro Ile Arg Ser Arg Ser Thr Ala Val Met Pro Lys Val
275 280 285

Ser Leu Ala Thr Glu Ala Thr Arg Ser Lys Asp Asp Ser Asp Lys
290 295 300

Glu Asn Val Cys Leu Leu Pro Asp Glu Ser Ile Gln Lys Gly Gln
305 310 315

Phe Val Gly Asp Gly Thr Ser Asn Thr Ile Glu Ser His Pro Lys
320 325 330

Asp Ser Leu His Ser Cys Asp His Pro Leu Ala Glu Lys Thr Gln
335 340 345

Phe Thr Val Ser Thr Leu Asp Asp Val Lys Asn Ser Gly Ser Ile
350 355 360

Arg Asp Asn Tyr Val Arg Thr Ser Glu Ile Ser Ala Val His Ile
365 370 375

Asp Thr Glu Cys Val Ser Val Met Leu Gln Ala Gly Thr Pro Pro
380 385 390

Leu Gln Val Asn Glu Glu Lys Phe Pro Ala Glu Lys Ala Arg Ile
395 400 405

Glu Asn Glu Met Asp Pro Ser Asp Ile Ser Asn Ser Met Ala Glu
410 415 420

Val Leu Ala Lys Lys Glu Glu Leu Ala Asp Arg Leu Glu Lys Ala
425 430 435

Asn Glu Glu Ala Ile Ala Ser Ala Ile Ala Glu Glu Glu Gln Leu
440 445 450

Thr Arg Glu Ile Glu Ala Glu Glu Asn Asn Asp Ile Asn Ile Glu
455 460 465

Thr Asp Asn Asp Ser Asp Phe Ser Ala Ser Met Gly Ser Gly Ser
470 475 480

Val Ser Phe Cys Gly Met Ser Met Asp Trp Asn Asp Val Leu Ala
485 490 495

Asp Tyr Glu Ala Arg Glu Ser Trp Arg Gln Asn Thr Ser Trp Gly
500 505 510

Asp Ile Val Glu Glu Glu Pro Ala Arg Pro Pro Gly His Gly Ile
515 520 525

His Met His Glu Lys Leu Ser Ser Pro Ser Arg Lys Arg Thr Ile
530 535 540

Ala Glu Ser Lys Lys Lys His Glu Glu Lys Gln Met Lys Ala Gln
545 550 555

Gln Leu Arg Glu Lys Leu Arg Glu Glu Lys Thr Leu Lys Leu Gln
560 565 570

Lys Leu Leu Glu Arg Glu Lys Asp Val Arg Lys Trp Lys Glu Glu
575 580 585

Leu Leu Asp Gln Arg Arg Arg Met Met Glu Glu Lys Leu Leu His
590 595 600

Ala Glu Phe Lys Arg Glu Val Gln Leu Gln Ala Ile Val Lys Lys
605 610 615

Ala Gln Glu Glu Glu Ala Lys Val Asn Glu Ile Ala Phe Ile Asn
620 625 630

Thr Leu Glu Ala Gln Asn Lys Arg His Asp Val Leu Ser Lys Leu
635 640 645

Lys Glu Tyr Glu Gln Arg Leu Asn Glu Leu Gln Glu Glu Arg Gln
650 655 660

Arg Arg Gln Glu Glu Lys Gln Ala Arg Asp Glu Ala Val Gln Glu
665 670 675

Arg Lys Arg Ala Leu Glu Ala Glu Arg Gln Ala Arg Val Glu Glu
680 685 690

Leu Leu Met Lys Arg Lys Glu Gln Glu Ala Arg Ile Glu Gln Gln
695 700 705

Arg Gln Glu Lys Glu Lys Ala Arg Glu Asp Ala Ala Arg Glu Arg
710 715 720

Ala Arg Asp Arg Glu Glu Arg Leu Ala Ala Leu Thr Ala Ala Gln
725 730 735

Gln Glu Ala Met Glu Glu Leu Gln Lys Lys Ile Gln Leu Lys His
740 745 750

Asp Glu Ser Ile Arg Arg His Met Glu Gln Ile Glu Gln Arg Lys
755 760 765

Glu Lys Ala Ala Glu Leu Ser Ser Gly Arg His Ala Asn Thr Asp
770 775 780

Tyr Ala Pro Lys Leu Thr Pro Tyr Glu Arg Lys Lys Gln Cys Ser
785 790 795

Leu Cys Asn Val Leu Ile Ser Ser Glu Val Tyr Leu Phe Ser His
800 805 810

Val Lys Gly Arg Lys His Gln Gln Ala Val Arg Glu Asn Thr Ser
815 820 825

Ile Gln Gly Arg Glu Leu Ser Asp Glu Glu Val Glu His Leu Ser
830 835 840

Leu Lys Lys Tyr Ile Ile Asp Ile Val Val Glu Ser Thr Ala Pro
845 850 855

Ala Glu Ala Leu Lys Asp Gly Glu Glu Arg Gln Lys Asn Lys Lys
860 865 870

Lys Ala Lys Lys Ile Lys Ala Arg Met Asn Phe Arg Ala Lys Glu
875 880 885

Tyr Glu Ser Leu Met Glu Thr Lys Asn Ser Gly Ser Asp Ser Pro
890 895 900

Tyr Lys Ala Lys Leu Gln Arg Leu Ala Lys Asp Leu Leu Lys Gln
905 910 915

Val Gln Val Gln Asp Ser Gly Ser Trp Ala Asn Asn Lys Val Ser
920 925 930

Ala Leu Asp Arg Thr Leu Gly Glu Ile Thr Arg Ile Leu Glu Lys
935 940 945

Glu Asn Val Ala Asp Gln Ile Ala Phe Gln Ala Ala Gly Gly Leu
950 955 960

Thr Ala Leu Glu His Ile Leu Gln Ala Val Val Pro Ala Thr Asn
965 970 975

Val Asn Thr Val Leu Arg Ile Pro Pro Lys Ser Leu Cys Asn Ala
980 985 990

Ile Asn Val Tyr Asn Leu Thr Cys Asn Asn Cys Ser Glu Asn Cys
995 1000 1005

Ser Asp Val Leu Phe Ser Asn Lys Ile Thr Phe Leu Met Asp Leu
1010 1015 1020

Leu Ile His Gln Leu Thr Val Tyr Val Pro Asp Glu Asn Asn Thr
1025 1030 1035

Ile Leu Gly Arg Asn Thr Asn Lys Gln Val Phe Glu Gly Leu Thr
1040 1045 1050

Thr Gly Leu Leu Lys Val Ser Ala Val Val Leu Gly Cys Leu Ile
1055 1060 1065

Ala Asn Arg Pro Asp Gly Asn Cys Gln Pro Ala Thr Pro Lys Ile
1070 1075 1080






Pro Thr Gln Glu Met Lys Asn Lys Thr Ser Gln Gly Asp Pro Phe
1085 1090 1095

Asn Asn Arg Val Gln Asp Leu Ile Ser Tyr Val Val Asn Met Gly
1100 1105 1110

Leu Ile Asp Lys Leu Cys Ala Cys Phe Leu Ser Val Gln Gly Pro
1115 1120 1125

Val Asp Glu Asn Pro Lys Met Ala Ile Phe Leu Gln His Ala Ala
1130 1135 1140

Gly Leu Leu His Ala Met Cys Thr Leu Cys Phe Ala Val Thr Gly
1145 1150 1155

Arg Ser Tyr Ser Ile Phe Asp Asn Asn Arg Gln Asp Pro Thr Gly
1160 1165 1170

Leu Thr Ala Ala Leu Gln Ala Thr Asp Leu Ala Gly Val Leu His
1175 1180 1185

Met Leu Tyr Cys Val Leu Phe His Gly Thr Ile Leu Asp Pro Ser
1190 1195 1200

Thr Ala Ser Pro Lys Glu Asn Tyr Thr Gln Asn Thr Ile Gln Val
1205 1210 1215

Ala Ile Gln Ser Leu Arg Phe Phe Asn Ser Phe Ala Ala Leu His
1220 1225 1230

Leu Pro Ala Phe Gln Ser Ile Val Gly Ala Glu Gly Leu Ser Leu
1235 1240 1245

Ala Phe Arg His Met Ala Ser Ser Leu Leu Gly His Cys Ser Gln
1250 1255 1260

Val Ser Cys Glu Ser Leu Leu His Glu Val Ile Val Cys Val Gly
1265 1270 1275

Tyr Phe Thr Val Asn His Pro Asp Asn Gln Val Ile Val Gln Ser
1280 1285 1290

Gly Arg His Pro Thr Val Leu Gln Lys Leu Cys Gln Leu Pro Phe
1295 1300 1305

Gln Tyr Phe Ser Asp Pro Arg Leu Ile Lys Val Leu Phe Pro Ser
1310 1315 1320

Leu Ile Ala Ala Cys Tyr Asn Asn His Gln Asn Lys Ile Ile Leu
1325 1330 1335

Glu Gln Glu Met Ser Cys Val Leu Leu Ala Thr Phe Ile Gln Asp
1340 1345 1350

Leu Ala Gln Thr Pro Gly Gln Ala Glu Asn Gln Pro Tyr Gln Pro
1355 1360 1365

Lys Gly Lys Cys Leu Gly Ser Gln Asp Tyr Leu Glu Leu Ala Asn
1370 1375 1380

Arg Phe Pro Gln Gln Ala Trp Glu Glu Ala Arg Gln Phe Phe Leu
1385 1390 1395

Lys Lys Glu Lys Lys
1400

<210> 22
<211> 1384
<212> PRT
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No: 7506096CD1
<400> 22

Met Glu Ser Ser Ser Ser Asp Tyr Tyr Asn Lys Asp Asn Glu Glu
1 5 10 15

Glu Ser Leu Leu Ala Asn Val Ala Ser Leu Arg His Glu Leu Lys
20 25 30

Ile Thr Glu Trp Ser Leu Gln Ser Leu Gly Glu Glu Leu Ser Ser
35 40 45

Val Ser Pro Ser Glu Asn Ser Asp Tyr Ala Pro Asn Pro Ser Arg
50 55 60

Ser Glu Lys Leu Ile Leu Asp Val Gln Pro Ser His Pro Gly Leu
65 70 75

Leu Asn Tyr Ser Pro Tyr Glu Asn Val Cys Lys Ile Ser Gly Ser
80 85 90

Ser Thr Asp Phe Gln Lys Lys Pro Arg Asp Lys Met Phe Ser Ser
95 100 105

Ser Ala Pro Val Asp Gln Glu Ile Lys Ser Leu Arg Glu Lys Leu
110 115 120

Asn Lys Leu Arg Gln Gln Asn Ala Cys Leu Val Thr Gln Asn His
125 130 135

Ser Leu Met Thr Lys Phe Glu Ser Ile His Phe Glu Leu Thr Gln
140 145 150

Ser Arg Ala Lys Val Ser Met Leu Glu Ser Ala Gln Gln Gln Ala
155 160 165

Ala Ser Val Pro Ile Leu Glu Glu Gln Ile Ile Asn Leu Glu Ala
170 175 180

Glu Val Ser Ala Gln Asp Lys Val Leu Arg Glu Ala Glu Asn Lys
185 190 195

Leu Glu Gln Ser Gln Lys Met Val Ile Glu Lys Glu Gln Ser Leu
200 205 210

Gln Glu Ser Lys Glu Glu Cys Ile Lys Leu Lys Val Asp Leu Leu
215 220 225

Glu Gln Thr Lys Gln Gly Lys Arg Ala Glu Arg Gln Arg Asn Glu
230 235 240

Ala Leu Tyr Asn Ala Glu Glu Leu Ser Lys Ala Phe Gln Gln Tyr
245 250 255

Lys Lys Lys Val Ala Glu Lys Leu Glu Lys Val Gln Ala Glu Glu
260 265 270

Glu Ile Leu Glu Arg Asn Leu Thr Asn Cys Glu Lys Glu Asn Lys
275 280 285

Arg Leu Gln Glu Arg Cys Gly Leu Tyr Lys Ser Glu Leu Glu Ile
290 295 300

Leu Lys Glu Lys Leu Arg Gln Leu Lys Glu Glu Asn Asn Asn Gly
305 310 315

Lys Glu Lys Leu Arg Ile Met Ala Val Lys Asn Ser Glu Val Met
320 325 330

Ala Gln Leu Thr Glu Ser Arg Gln Ser Ile Leu Lys Leu Glu Ser
335 340 345

Glu Leu Glu Asn Lys Asp Glu Ile Leu Arg Asp Lys Phe Ser Leu
350 355 360

Met Asn Glu Asn Arg Glu Leu Lys Val Arg Val Ala Ala Gln Asn
365 370 375

Glu Arg Leu Asp Leu Cys Gln Gln Glu Ile Glu Ser Ser Arg Val
380 385 390

Glu Leu Arg Ser Leu Glu Lys Ile Ile Ser Gln Leu Pro Leu Lys
395 400 405

Arg Glu Leu Phe Gly Phe Lys Ser Tyr Leu Ser Lys Tyr Gln Met
410 415 420

Ser Ser Phe Ser Asn Lys Glu Asp Arg Cys Ile Gly Cys Cys Glu
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Ala Asn Lys 
Lys Glu Ala 
Gin Leu Ser 
Ser Lys Leu 
His Gin Val 
Asn Lys Gin 
Glu Glu Leu 
Asp Leu Lys 
He Gin Glu 
Glu Ser Glu 
Lys Gin Leu 
Lys Asn Ala 
He Arg Ser 
Cys Leu Ala 
Lys Glu Met 
Ala Leu Glu 
Leu Glu Gly 
Ser Glu Glu 
His Ser Leu 
Thr Leu Gin 
Asn Gly Glu 
Ser Lys Leu 
Glu Lys Leu 
Glu Ala Asp 
Ala Arg Gin 
Ser Lys Met 
Asn Lys Ala 
Thr Lys Thr 



425 
Leu Val 

440 
Glu lie 

455 
Gin Ser 

470 
Ser Ser 

485 
Ala Glu 

500 
Tyr Glu 

515 
Arg Thr 

530 
Val Asn 

545 
Glu Leu 

560 
Met Thr 

575 
Glu Glu 

590 
Glu Leu 

605 
Leu Glu 

620 
Phe Glu 

635 
Glu Lys 

650 
He Cys 

665 
Asn Lys 

680 
Val Tyr 

695 
Gin Glu 

710 
Gin Gin 

725 
Leu Glu 

740 
Glu Gin 

755 
Arg Lys 

770 
Leu Lys 

785 
Val Lys 

800 
Glu Lys 

815 
Met His 

830 
Glu Leu 



He Ser 
Gin Lys 
Leu He 
Leu Glu 
Ser Val 
Lys Glu 
Lys Leu 
Met Ala 
Leu Glu 
Lys Lys 
Lys He 
Glu Gin 
Thr Asn 
Lys Ala 
Gin He 
Lys Glu 
Glu Lys 
Cys Leu 
Thr Ser 
Gin Gin 
Asp Thr 
Glu Leu 
Met Glu 
Arg Gin 
He Glu 
Glu He 
Leu Ser 
Glu Lys 



Glu Leu 
Leu His 
Thr Cys 
Thr Glu 
Lys Asp 
Arg Gin 
He Gin 
His Arg 
Lys Ala 
Cys Ser 
Val Ala 
Glu Leu 
He Asn 
Lys Lys 
Glu Arg 
Glu Leu 
Phe Glu 
Gin Lys 
Glu Gin 
Met Leu 
Gin Thr 
Gin Lys 
Glu Lys 
Lys Val 
Met Asp 
Met His 
Gin Leu 
Lys Thr 



430 

Arg He Lys 
445 

Ala Asn Leu 
460 

Asn Asp Ser 
475 

Pro Val Lys 
490 

Gin Asn Gin 

505 

Arg Leu Val 
520 

He Glu Ala 
535 

Thr Ser Gin 
550 

Ser Asn Ser 
565 

Gin Leu Leu 
580 

Tyr Ser Ser 
595 

Met Glu Lys 
610 

Thr Glu His 
625 

He His Leu 
640 

Val Arg Gin 
655 

Val Leu His 
670 

Lys Gin Leu 
685 

Glu Leu Lys 
700 

Asn Val He 
715 

Gin Gin Glu 
730 

Lys Leu Glu 
745 

Gin Arg Glu 
760 

Cys Glu Ser 
775 

He Glu Leu 
790 

Gin Tyr Lys 
805 

Leu Lys Arg 
820 

Asp Met He 
835 

Asn Ala Val 



Leu Ala 
Thr Ala 
Gin Glu 
Leu Gly 
His Thr 
Thr Gly 
Glu Asn 
Phe Gin 
Ser Lys 
Thr Leu 
He Ala 
Asn Glu 
Glu Lys 
Glu Gin 
Leu Asp 
Leu Asn 
Lys Lys 
He Lys 
Leu Gin 
Thr He 
Lys Gin 
Ser Ser 
Ala Ala 
Thr Gly 
Glu Glu 
Asp Gly 
Leu Asp 
Lys Glu 



435 
He 
450 
Asn 
465 
Ser 
480 
Gly 
495 
Met 
510 
He 
525 
Ser 
540 
Leu 
555 
Leu 
570 
Glu 
585 
Ala 
600 
Lys 
615 
He 
630 
His 
645 
Ser 
660 
Gin 
675 
Lys 
690 
Asn 
705 
His 
720 
Arg 
735 
Val 
750 
Ala 
765 
His 
780 
Thr 
795 
Leu 
810 
Glu 
825 
Gin 
840 
Leu 
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PCT/US02/09809 



845 850 855 

Glu Lys Leu Gln His Ser Thr Glu Thr Glu Leu Thr Glu Ala Leu
860 865 870

Gln Lys Arg Glu Val Leu Glu Thr Glu Leu Gln Asn Ala His Gly
875 880 885

Glu Leu Lys Ser Thr Leu Arg Gln Leu Gln Glu Leu Arg Asp Val
890 895 900

Leu Gln Lys Ala Gln Leu Ser Leu Glu Glu Lys Tyr Thr Thr Ile
905 910 915

Lys Asp Leu Thr Ala Glu Leu Arg Glu Cys Lys Met Glu Ile Glu
920 925 930

Asp Lys Lys Gln Glu Leu Leu Glu Met Asp Gln Ala Leu Lys Glu
935 940 945

Arg Asn Trp Glu Leu Lys Gln Arg Ala Ala Gln Val Thr His Leu
950 955 960

Asp Met Thr Ile Arg Glu His Arg Gly Glu Met Glu Gln Lys Ile
965 970 975

Ile Lys Leu Glu Gly Thr Leu Glu Lys Ser Glu Leu Glu Leu Lys
980 985 990

Glu Cys Asn Lys Gln Ile Glu Ser Leu Asn Asp Lys Leu Gln Asn
995 1000 1005

Ala Lys Glu Gln Leu Arg Glu Lys Glu Phe Ile Met Leu Gln Asn
1010 1015 1020

Glu Gln Glu Ile Ser Gln Leu Lys Lys Glu Ile Glu Arg Thr Gln
1025 1030 1035

Gln Arg Met Lys Glu Met Glu Ser Val Met Lys Glu Gln Glu Gln
1040 1045 1050

Tyr Ile Ala Thr Gln Tyr Lys Glu Ala Ile Asp Leu Gly Gln Glu
1055 1060 1065

Leu Arg Leu Thr Arg Glu Gln Val Gln Asn Ser His Thr Glu Leu
1070 1075 1080

Ala Glu Ala Arg His Gln Gln Val Gln Ala Gln Arg Glu Ile Glu
1085 1090 1095

Arg Leu Ser Ser Glu Leu Glu Asp Met Lys Gln Leu Ser Lys Glu
1100 1105 1110

Lys Asp Ala His Gly Asn His Leu Ala Glu Glu Leu Gly Ala Ser
1115 1120 1125

Lys Val Arg Glu Ala His Leu Glu Ala Arg Met Gln Ala Glu Ile
1130 1135 1140

Lys Lys Leu Ser Ala Glu Val Glu Ser Leu Lys Glu Ala Tyr His
1145 1150 1155

Met Glu Met Ile Ser His Gln Glu Asn His Ala Lys Trp Lys Ile
1160 1165 1170

Ser Ala Asp Ser Gln Lys Ser Ser Val Gln Gln Leu Asn Glu Gln
1175 1180 1185

Leu Glu Lys Ala Lys Leu Glu Leu Glu Glu Ala Gln Asp Thr Val
1190 1195 1200

Ser Asn Leu His Gln Gln Val Gln Asp Arg Asn Glu Val Ile Glu
1205 1210 1215

Ala Ala Asn Glu Ala Leu Leu Thr Lys Glu Ser Glu Leu Thr Arg
1220 1225 1230

Leu Gln Ala Lys Ile Ser Gly His Glu Lys Ala Glu Asp Ile Lys
1235 1240 1245

Phe Leu Pro Ala Pro Phe Thr Ser Pro Thr Glu Ile Met Pro Asp
1250 1255 1260

Val Gln Asp Pro Lys Phe Ala Lys Cys Phe His Thr Ser Phe Ser
1265 1270 1275 

Lys Cys Thr Lys Leu Arg Arg Ser Ile Ser Ala Ser Asp Leu Thr 
1280 1285 1290 

Lys Ile His Gly Asp Glu Asp Leu Ser Glu Glu Leu Leu Gln 
1295 1300 1305 

Asp Leu Lys Lys Met Gln Leu Glu Gln Pro Ser Thr Leu Glu Glu 
1310 1315 1320 

Ser His Lys Asn Leu Thr Tyr Thr Gln Pro Asp Ser Phe Lys Pro 
1325 1330 1335 

Leu Thr Tyr Asn Leu Glu Ala Asp Ser Ser Glu Asn Asn Asp Phe 
1340 1345 1350 

Asn Thr Leu Ser Gly Met Leu Arg Tyr Ile Asn Lys Glu Val Arg 
1355 1360 1365 

Leu Leu Lys Lys Ser Ser Met Gln Thr Gly Ala Gly Leu Asn Gln 
1370 1375 1380 

Gly Glu Asn Val 











<210> 23 
<211> 787 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 7505914CD1 

<400> 23 



Met Trp Asp Gln Gly Gly Gln Pro Trp Gln Gln Trp Pro Leu Asn 
1 5 10 15 

Gln Gln Gln Trp Met Gln Ser Phe Gln His Gln Gln Asp Pro Ser 
20 25 30 

Gln Ile Asp Trp Ala Ala Leu Ala Gln Ala Trp Ile Ala Gln Arg 
35 40 45 

Glu Ala Ser Gly Gln Gln Ser Met Val Glu Gln Pro Pro Gly Met 
50 55 60 

Met Pro Asn Gly Gln Asp Met Ser Thr Met Glu Ser Gly Pro Asn 
65 70 75 

Asn His Gly Asn Phe Gln Gly Asp Ser Asn Phe Asn Arg Met Trp 
80 85 90 

Gln Pro Glu Trp Gly Met His Gln Gln Pro Pro His Pro Pro Pro 
95 100 105 

Asp Gln Pro Trp Met Pro Pro Thr Pro Gly Pro Met Asp Ile Val 
110 115 120 

Pro Pro Ser Glu Asp Ser Asn Ser Gln Asp Ser Gly Glu Phe Ala 
125 130 135 

Pro Asp Asn Arg His Ile Phe Asn Gln Asn Asn His Asn Phe Gly 
140 145 150 

Gly Pro Pro Asp Asn Phe Ala Val Gly Pro Val Asn Gln Phe Asp 
155 160 165 

Tyr Gln His Gly Ala Ala Phe Gly Pro Pro Gln Gly Gly Phe His 
170 175 180 

Pro Pro Tyr Trp Gln Pro Gly Pro Pro Gly Pro Pro Ala Pro Pro 
185 190 195 

Gln Asn Arg Arg Glu Arg Pro Ser Ser Phe Arg Asp Arg Gln Arg 
200 205 210 
Ser Pro Ile Ala Leu Pro Val Lys Gln Glu Pro Pro Gln Ile Asp 
215 220 225 

Ala Val Lys Arg Arg Thr Leu Pro Ala Trp Ile Arg Glu Gly Leu 
230 235 240 

Glu Lys Met Glu Arg Glu Lys Gln Lys Lys Leu Glu Lys Glu Arg 
245 250 255 

Met Glu Gln Gln Arg Ser Gln Leu Ser Lys Lys Glu Lys Lys Ala 
260 265 270 

Thr Glu Asp Ala Glu Gly Gly Asp Gly Pro Arg Leu Pro Gln Arg 
275 280 285 

Ser Lys Phe Asp Ser Asp Glu Glu Glu Glu Asp Thr Glu Asn Val 
290 295 300 

Glu Ala Ala Ser Ser Gly Lys Val Thr Arg Ser Pro Ser Pro Val 
305 310 315 

Pro Gln Glu Glu His Ser Asp Pro Glu Met Thr Glu Glu Glu Lys 
320 325 330 

Glu Tyr Gln Met Met Leu Leu Thr Lys Met Leu Leu Thr Glu Ile 
335 340 345 

Leu Leu Asp Val Thr Asp Glu Glu Ile Tyr Tyr Val Ala Lys Asp 
350 355 360 

Ala His Arg Lys Ala Thr Lys Gly Gly Leu Gly Gly Tyr Gly Ser 
365 370 375 

Gly Asp Ser Glu Asp Glu Arg Ser Asp Arg Gly Ser Glu Ser Ser 
380 385 390 

Asp Thr Asp Asp Glu Glu Leu Arg His Arg Ile Arg Gln Lys Gln 
395 400 405 

Glu Ala Phe Trp Arg Lys Glu Lys Glu Gln Gln Leu Leu His Asp 
410 415 420 

Lys Gln Met Glu Glu Glu Lys Gln Gln Thr Glu Arg Val Thr Lys 
425 430 435 

Glu Met Asn Glu Phe Ile His Lys Glu Gln Asn Ser Leu Ser Leu 
440 445 450 

Leu Glu Ala Arg Glu Ala Asp Gly Asp Val Val Asn Glu Lys Lys 
455 460 465 

Arg Thr Pro Asn Glu Thr Thr Ser Val Leu Glu Pro Lys Lys Glu 
470 475 480 

His Lys Glu Lys Glu Lys Gln Gly Arg Ser Arg Ser Gly Ser Ser 
485 490 495 

Ser Ser Gly Ser Ser Ser Ser Asn Ser Arg Thr Ser Ser Thr Ser 
500 505 510 

Ser Thr Val Ser Ser Ser Ser Tyr Ser Ser Ser Ser Gly Ser Ser 
515 520 525 

Arg Thr Ser Ser Arg Ser Ser Ser Pro Lys Arg Lys Lys Arg His 
530 535 540 

Ser Arg Ser Arg Ser Pro Thr Ile Lys Ala Arg Arg Ser Arg Ser 
545 550 555 

Arg Ser Tyr Ser Arg Arg Ile Lys Ile Glu Ser Asn Arg Ala Arg 
560 565 570 

Val Lys Ile Arg Asp Arg Arg Arg Ser Asn Arg Asn Ser Ile Glu 
575 580 585 

Arg Glu Arg Arg Arg Asn Arg Ser Pro Ser Arg Glu Arg Arg Arg 
590 595 600 

Ser Arg Ser Arg Ser Arg Asp Arg Arg Thr Asn Arg Ala Ser Arg 
605 610 615 

Ser Arg Ser Arg Asp Arg Arg Lys Ile Asp Asp Gln Arg Gly Asn 
620 625 630 
Leu Ser Gly Asn Ser His Lys His Lys Gly Glu Ala Lys Glu Gln 
635 640 645 

Glu Arg Lys Lys Glu Arg Ser Arg Ser Ile Asp Lys Asp Arg Lys 
650 655 660 

Lys Lys Asp Lys Glu Arg Glu Arg Glu Gln Asp Lys Arg Lys Glu 
665 670 675 

Lys Gln Lys Arg Glu Glu Lys Asp Phe Lys Phe Ser Ser Gln Asp 
680 685 690 

Asp Arg Leu Lys Arg Lys Arg Glu Ser Glu Arg Thr Phe Ser Arg 
695 700 705 

Ser Gly Ser Ile Ser Val Lys Ile Ile Arg His Asp Ser Arg Gln 
710 715 720 

Asp Ser Lys Lys Ser Thr Thr Lys Asp Ser Lys Lys His Ser Gly 
725 730 735 

Ser Asp Ser Ser Gly Arg Ser Ser Ser Glu Ser Pro Gly Ser Ser 
740 745 750 

Lys Glu Lys Lys Ala Lys Lys Pro Lys His Ser Arg Ser Arg Ser 
755 760 765 

Val Glu Lys Ser Gln Arg Ser Gly Lys Lys Ala Ser Arg Lys His 
770 775 780 

Lys Ser Lys Ser Arg Ser Arg 
785 



<210> 24 

<211> 3332 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 71230017CB1 



<400> 24 

gccgggcttt gggttctggg cctctgccgc tctctggccc taagtgctga gctgccggga 60 
acggcagctt ctgacgctgg gccattggac gctgcggaac caggcttctt cactttgagt 120 
ttccgccgcg aagcgccagt ccgggccgag gagggagcct ttactacttc tccctggttt 180 
cattcatgtt ctgaggaggg tgtgagaagg aaccatggat cccacagcct tggtggaagc 240 
cattgtggaa gaagtggcct gtcccatctg tatgaccttc ctgagggagc ccatgagcat 300 
tgactgtggc cacagcttct gccacagctg tctctctgga ctctgggaga tcccaggaga 360 
atcccagaac tggggttaca cctgtcccct ctgtcgagct cctgtccagc caaggaacct 420 
gcggcctaat tggcagctgg ccaatgttgt agaaaaagtc cgtctgctaa ggctacatcc 480 
aggaatgggg ctgaagggtg acctgtgtga gcgccatggg gaaaagctga agatgttctg 540 
caaagaggat gtcttgataa tgtgtgaggc ctgcagccag tccccagagc atgaggccca 600 
cagtgttgtg ccaatggagg atgttgcctg ggagtacaag tgggaacttc atgaggccct 660 
cgaacatctg aagaaagagc aagaagaggc ctggaagctt gaagttggtg aaaggaaacg 720 
aactgccacc tggaagatac aggtggaaac ccgaaaacag agtattgtat gggagtttga 780 
aaaataccag cgattactag agaaaaagca gccaccacat cggcagctgg gggcagaggt 840 
agcagcagct ctggccagcc tacagcggga ggcagcggag accatgcaga aactggagtt 900 
gaaccatagc gagctcatcc agcagagcca ggtcctgtgg aggatgattg cagagttgaa 960 
agagaggtcg cagaggcctg tccgctggat gttgcaggat attcaggaag tgttaaacag 1020 
gagcaaatct tggagcttgc agcagccaga accaatctcc ctggagttga agacagattg 1080 
ccgtgtgctg gggctaagag agatcctgaa gacttatgca gctgatgtgc gcttggatcc 1140 
agatactgct tactcccgtc tcatcgtgtc tgaggacaga aaacgtgtgc actatggaga 1200 
caccaaccag aaactgccag acaatcctga gagattttac cgctataata tcgtcctggg 1260 
aagccagtgc atctcctcag gccggcacta ctgggaggtg gaggtgggag acaggtctga 1320 
gtggggcctg ggagtatgta agcaaaatgt agaccggaag gaggtggtct acttatcccc 1380 
ccactatgga ttctgggtga taaggctgag gaagggaaat gagtaccgag caggcaccga 1440 

tgagtaccca atcctgtcct tgccggtccc tcctcgccgg gtgggaatct tcgtggatta 1500 

tgaggcccat gacatttctt tctacaatgt gactgactgt ggctcccaca tcttcacttt 1560 

cccccgctat cccttccctg ggcgcctcct gccctatttt agtccttgct acagcattgg 1620 

aaccaacaac actgctcctc tggccatctg ctccctggat ggggaggact aagaaagcta 1680 

ccaccctaac cacagaggct tggaattggg cctggccccc atggggcttg gaggaccgag 1740 

ccactgacag gtatcccctg aaactgagct gagcccagta tccaaggatt cctctgtctg 1800 

atcctttggt ctttgctacc aggctgaagt ctgtcatgaa accacttatt ttaaaaagca 1860 

gaggcccagt caaatgagca ttgcatccca tgagggaagc acgacagggc tgatggtgag 1920 

gatcagagca gttctaaggt gactcgttgg ggtaaggatc aggactttgt ccatgtagta 1980 

gccaaccacc ctcttccctg attcccgtcc ggtgtcacag ttcagtcagt gaggatgatg 2040 

aagtagatac agtcttcagg acaccattag atgggctttc ccaataggcc aaaaaaatgc 2100 

tgcgcatacc cagagctggt tgttgtgctg aggccagtca gaggatgctt cccctgaggt 2160 

ttgctataac taagcaacct ttatgtgact ctcaccttct gacctcctgg caagagaaat 2220 

tcagtgcagc agggggacac agacctgccc aagccacccc actgccgttc cctctctgag 2280 

cacaagctgg gcaaatcact gtcccttgga ctccagtaga ccagtgtcct agtcttgcct 2340 

tttttctcta agtggcagga tcagaaaacc tgcgagcttt agtttgtatt ttcactttat 2400 

gaatgaggaa actgaaatgg ccttaaggga gcaagttatt tctttttttt tgacacggag 2460 

tctcgctctg ttgcccaggc tggagtgcag tggcacgatc tcggctcact gcaggctctg 2520 

cctcctgggt tcacgccatt ctcctgcctc agcttcccga gtagctggga ctacaggcgc 2580 

ccaccacgac gcctggctca tttttttgta tttttagtag agacggggtt tcaccatgtt 2640 

agctaggatg gtctcgatct cctgacctca tgatccgccc tcctcagcct cccacagtgc 2700 

tgggattaga ggcatgagcc actgcgcccg gcccctggag caagttattt cttacaaagc 2760 

tgctgaaggt aagattatca aaattataaa gcatttttca cactcaagtg aaacaaggtt 2820 
gacaaactca cttcgcaggt cacatgccta tacatcactt attatatttg ggtctgaaac 2880 

ttctcacatg tttgggaggt tttatgtgtc ctcattggga aaatgggtgt aattcagcat 2940 
aaaacctcat atgattgtcc tgcctcatgg agctgttgta tagatcccag atccatccca 3000 

tgatttgttc ctgtctgagg catagaggca ggcaagccgt ggattttgca catggtgact 3060 

ttcccactgt gccatgatac agtctgcatc ttatagcagt gcctttgtct cagggcctct 3120 

gctggcagtc tagacctttt gggcagaaag gagcttcaaa tggctgtgat aaggaatatt 3180 

aaaaattgtg tttctacttt aattgtattg gctgttcatg tatgtaggag ttaaaatagg 3240 

ccaaactgga gaaataaacg cattctgtcc accatgaaaa aaaaaaaaaa aaaaaaaaaa 3300 

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aa 3332 

<210> 25 

<211> 4410 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 3125036CB1 

<400> 25 

atggaatcta gttcatcaga ctactataat aaagacaatg aagaggaaag tttgcttgca 60 
aatgttgctt ccttaagaca tgaactgaag ataacagaat ggagtttgca gagtttaggg 120 
gaagagttat ccagtgttag tccaagtgaa aattctgatt atgcccctaa tccttcaagg 180 
tctgaaaagc taattttgga tgttcagcct agccaccctg gacttttgaa ttattcacct 240 
tatgaaaacg tctgtaaaat atctggtagc agcactgatt ttcaaaaaaa gccaagagat 300 
aagatgtttt catcttctgc ccctgtggat caggagatta aaagccttcg agagaaacta 360 
aataaactta ggcaacagaa tgcttgtttg gtcacacaga atcattcctt aatgactaaa 420 
tttgaatcta ttcactttga attaacacag tcaagagcaa aagtttctat gcttgagtct 480 
gctcaacagc aggcagccag tgtcccaatc ttagaagaac agattataaa tttggaagca 540 
gaggtttcag ctcaagataa agttttgaga gaggcagaaa ataagctgga acagagccag 600 
aaaatggtaa ttgaaaagga acagagtttg caggagtcca aagaggaatg tataaaatta 660 
aaggtggact tacttgaaca aaccaaacaa ggaaaaagag ctgaacgaca aaggaatgaa 720 
gcactatata atgccgaaga gctgagtaaa gctttccaac aatataaaaa aaaagtggct 780 
gaaaaactgg aaaaggtaaa aggcagttgt gcaaattcag tgttttgtat tactgtctat 840 
attccaacag taaaggttca agctgaagaa gaaatattag agagaaatct aactaactgt 900 
gaaaaagaaa ataaaaggct acaagaaagg tgtggtctat ataaaagtga acttgaaatt 960 
ctgaaagaga aattaaggca gttaaaagaa gaaaataaca acggaaaaga aaaattaagg 1020 
atcatggcag tgaaaaattc agaagtcatg gcacaactaa ctgaatctag acaaagtatt 1080 
ttgaagctag agagtgagtt agagaacaaa gacgaaatac ttagagacaa attttcttta 1140 
atgaatgaaa accgagaatt aaaggtccgt gttgcagcac agaatgagcg actagattta 1200 
tgtcaacaag aaattgaaag ttcaagggta gaactaagaa gtttggaaaa gattatatcc 1260 
cagttgccat taaaaagaga attatttggc tttaaatcat atctttctaa ataccagatg 1320 
agtagcttct caaacaagga agaccgttgc attggctgct gtgaggcaaa taaattggtg 1380 
atttcggaat tgagaattaa gcttgcaata aaagaggcag aaattcaaaa gcttcatgca 1440 
aacctgactg caaatcagtt atctcagagt cttattactt gtaatgacag ccaagaaagt 1500 
agcaaattaa gtagtttaga aacagaacct gtaaagctag gtggtcatca agtagcagaa 1560 
agcgtaaaag atcaaaatca acatactatg aacaagcaat atgaaaaaga gaggcaaaga 1620 
cttgttactg gaatagaaga actacgtact aagctgatac aaatagaagc tgaaaattct 1680 
gatttgaagg ttaacatggc tcacagaact agtcagtttc agctgattca agaggagctg 1740 
ctagagaaag cttcaaactc cagcaaactg gaaagtgaaa tgacaaagaa atgttctcaa 1800 
cttttaactc ttgagaaaca gctggaagaa aagatagttg cttattcctc tattgctgca 1860 
aaaaatgcag aactagaaca ggagcttatg gaaaagaatg aaaagataag gagtctagaa 1920 
accaatatta atacagagca tgagaaaatt tgtttagcct ttgaaaaagc aaagaaaatt 1980 
cacttggaac agcataaaga aatggaaaag cagattgaaa gagttaggca actagattca 2040 
gcattggaaa tttgtaagga agaacttgtc ttgcatttga atcaattgga aggaaataag 2100 
gaaaagtttg aaaaacagtt aaagaagaaa tctgaagagg tatattgttt acagaaagag 2160 
ctaaagataa aaaatcacag tcttcaagag acttctgagc aaaacgttat tctacagcat 2220 
actcttcagc aacagcagca aatgttacaa caagagacaa ttagaaatgg agagctagaa 2280 
gatactcaaa ctaaacttga aaaacaggtg tcaaaactgg aacaagaact tcaaaaacaa 2340 
agggaaagtt cagctgaaaa gttgagaaaa atggaggaga aatgtgaatc agctgcacat 2400 
gaagcagatt tgaaaaggca aaaagtgatt gagcttactg gcactgccag gcaagtaaag 2460 
attgagatgg atcagtacaa agaagagctg tctaaaatgg aaaaggaaat aatgcaccta 2520 
aaacgagatg gagaaaataa agcaatgcac ctctctcaat tagatatgat cttagatcag 2580 
acaaagacag agctagaaaa gaaaacaaat gctgtaaagg agttagaaaa gttacagcac 2640 
agtactgaaa ctgaactaac agaagccttg caaaaacggg aagtacttga gactgaacta 2700 
caaaatgctc atggagaatt aaaaagtact ttaagacaac tccaggaatt gagagatgta 2760 
ctacagaagg ctcaattatc attagaggaa aaatacacta ctataaagga tctcacagct 2820 
gaacttagag aatgcaagat ggagattgaa gacaaaaagc aggagctcct tgaaatggat 2880 
caggcactta aagagagaaa ttgggaacta aagcaaagag cagctcaggt tacacatttg 2940 
gatatgacta ttcgtgagca cagaggagaa atggaacaaa aaataattaa attagaaggt 3000 
actctggaga aatcagaatt ggaacttaaa gaatgtaaca aacagataga aagtctgaat 3060 
gacaaattac aaaatgctaa agaacaggtt cgagaaaaag agtttataat gctacaaaat 3120 
gaacaggaga taagtcaact gaaaaaagaa attgaaagaa cacaacaaag gatgaaagaa 3180 
atggagagtg ttatgaaaga gcaagaacag tacattgcca ctcagtacaa ggaggccata 3240 
gatttggggc aagaattgag gctgacccgg gagcaggtgc agaactctca tacagaattg 3300 
gcagaggctc gtcatcagca agtccaagca cagagagaaa tagaaaggct ctctagtgaa 3360 
ctggaggata tgaagcaact ctctaaagag aaagatgctc atggaaacca tttagctgaa 3420 
gaactggggg cttctaaagt acgtgaagct catttagaag caagaatgca agcagaaatc 3480 
aagaaattgt cagcagaagt agaatctctc aaagaagctt atcatatgga gatgatttca 3540 
catcaagaga accatgcaaa gtggaagatt tctgctgact ctcaaaagtc ttctgttcag 3600 
caactaaacg aacagttaga gaaggcaaaa ttggaattag aagaagctca ggatactgta 3660 
agcaatttgc atcaacaagt ccaagatagg aatgaagtaa ttgaagctgc aaatgaagca 3720 
ttacttacta aagaatcaga attaaccaga ttacaggcca aaatttctgg acatgaaaag 3780 
gcagaagaca tcaagtttct gccagcccca tttacatctc caacagaaat tatgcctgat 3840 
gttcaagatc caaaatttgc taaatgtttt cacacatctt tttccaagtg tacaaaatta 3900 
cgtcgctcta ttagtgccag tgatcttact ttcaaaattc atggtgatga agatctttct 3960 
gaagaattac tacaggactt aaagaaaatg caattagaac agccttcaac attagaagaa 4020 
agccataaga atctgactta cacccagcca gactcattta aacctctcac atataaccta 4080 
gaagctgata gttctgagaa taatgacttt aacacgctta gtgggatgct aagatacata 4140 
aacaaagaag taagactatt aaaaaagtct tctatgcaaa caggtgctgg tttaaatcag 4200 
ggagaaaatg tgtaattcaa agaagatact gatgtgttga aaaaatggaa tttttggtac 4260 
tgtgctgttt acttattata tgtagctcat acttcataga agctgttatt ttgcttttga 4320 
ataaatttta tatttcaata ttttaaaaga aagcccttct aaaacttaat tatattttta 4380 
aagaaaattt aaaaaaaaaa aaaaaggggg 4410 

<210> 26 
<211> 5032 
<212> DNA 
<213> Homo sapiens 
<220> 
<221> misc_feature 
<223> Incyte ID No: 1758089CB1 

<400> 26 

ccggcccgag cggggcctgg gggtgcgacg ccgagggcgg gggagagcgc gccgctgctc 60 
ccggaccggg ccgcgcacgc cgcctcagga accatcactg ttgctggagg cacctgacaa 120 
atcctagcga atttttggag catctccacc caggaacctc gccatccaga agtgtgcttc 180 
ccgcacagct gcagccatgg ggtctgagga ccacggcgcc cagaacccca gctgtaaaat 240 
catgacgttt cgcccaacca tggaagaatt taaagacttc aacaaatacg tggcctacat 300 
agagtcgcag ggagcccacc gggcgggcct ggccaagatc atccccccga aggagtggaa 360 
gccgcggcag acgtatgatg acatcgacga cgtggtgatc ccggcgccca tccagcaggt 420 
ggtgacgggc cagtcgggcc tcttcacgca gtacaatatc cagaagaagg ccatgacagt 480 
gggcgagtac cgccgcctgg ccaacagcga gaagtactgt accccgcggc accaggactt 540 
tgacgacctt gaacgcaaat actggaagaa cctcaccttt gtctccccga tctacggggc 600 
tgacatcagc ggctctttgt atgatgacga cgtggcccag tggaacatcg ggagcctccg 660 
gaccatcctg gacatggtgg agcgcgagtg cggcaccatc atcgagggcg tgaacacgcc 720 
ctacctgtac ttcggcatgt ggaagaccac cttcgcctgg cacaccgagg acatggacct 780 
gtacagcatc aactacctgc actttgggga gcctaagtcc tggtacgcca tcccaccaga 840 
gcacggcaag cgcctggagc ggctggccat cggcttcttc cccgggagct cgcagggctg 900 
cgacgccttc ctgcggcata agatgaccct catctcgccc atcatcctga agaagtacgg 960 
gatccccttc agccggatca cgcaggaggc cggggaattc atgatcacat ttccctacgg 1020 
ctaccacgcc ggcttcaatc acgggttcaa ctgcgcagaa tctaccaact tcgccaccct 1080 
gcggtggatt gactacggca aagtggccac tcagtgcacg tgccggaagg acatggtcaa 1140 
gatctccatg gacgtgttcg tgcgcatcct gcagcccgag cgctacgagc tgtggaagca 1200 
gggcaaggac ctcacggtgc tggaccacac gcggcccacg gcgctcacca gccccgagct 1260 
gagctcctgg agtgcgtccc gggcctcgct gaaggccaag ctcctccgca ggtctcaccg 1320 
gaaacggagc cagcccaaga agccgaagcc cgaagacccc aagttccctg gggagggtac 1380 
ggctggggca gcgctcctag aggaggctgg gggcagcgtg aaggaggagg ctgggccgga 1440 
ggttgacccc gaggaggagg aggaggagcc gcagccactg ccacacggcc gggaggccga 1500 
gggcgcagaa gaggacggga ggggcaagct gcggccaacc aaggccaaga gcgagcggaa 1560 
gaagaagagc ttcggcctgc tgcccccaca gctgccgccc ccgcctgctc acttcccctc 1620 
agaggaggcg ctgtggctgc catccccact ggagcccccg gtgctgggcc caggccctgc 1680 
agccatggag gagagccccc tgccggcacc ccttaatgtc gtgccccctg aggtgcccag 1740 
tgaggagcta gaggccaagc ctcggcccat catccccatg ctgtacgtgg tgccgcggcc 1800 
gggcaaggca gccttcaacc aggagcacgt gtcctgccag caggcctttg agcactttgc 1860 
ccagaagggt ccgacctgga aggaaccagt ttcccccatg gagctgacgg ggccagagga 1920 
cggtgcagcc agcagtgggg caggtcgcat ggagaccaaa gcccgggccg gagaggggca 1980 
ggcaccgtcc acattttcca aattgaagat ggagatcaag aagagccggc gccatcccct 2040 
gggccggccg cccacccggt ccccactgtc ggtggtgaag caggaggcct caagtgacga 2100 
ggaggcatcc cctttctccg gggaggaaga tgtgagtgac ccggacgcct tgaggccgct 2160 
gctgtctctg cagtggaaga acagggcggc cagcttccag gccgagagga agttcaacgc 2220 
agcggctgcg cgcacggagc cctactgcgc catctgcacg ctcttctacc cctactgcca 2280 
ggccctacag actgagaagg aggcacccat agcctccctc ggagagggct gcccggccac 2340 
attaccctcc aaaagccgtc agaagacccg accgctcatc cctgagatgt gcttcacctc 2400 
tggcggtgag aacacggagc cgctgcctgc caactcctac atcggcgacg acgggaccag 2460 
ccccctgatc gcctgcggca agtgctgcct gcaggtccat gccagttgct atggcatccg 2520 
tcccgagctg gtcaatgaag gctggacgtg ttcccggtgc gcggcccacg cctggactgc 2580 
ggagtgttgc ctgtgcaacc tgcgaggagg tgcgctgcag atgaccaccg ataggaggtg 2640 
gatccacgtg atctgtgcca tcgcagtccc cgaggcgcgc ttcctgaacg tgattgagcg 2700 
ccaccctgtg gacatcagcg ccatccccga gcagcggtgg aagctgaaat gcgtgtactg 2760 
ccggaagcgg atgaagaagg tgtcaggtgc ctgtatccag tgctcctacg agcactgctc 2820 
cacgtccttc cacgtgacct gcgcccacgc cgcaggcgtg ctcatggagc cggacgactg 2880 
gccctatgtg gtctccatca cctgcctcaa gcacaagtcg gggggtcacg ctgtccaact 2940 
cctgagggcc gtgtccctag gccaggtggt catcaccaag aaccgcaacg ggctgtacta 3000 
ccgctgtcgc gtcatcggtg ccgcctcgca gacctgctac gaagtgaact tcgacgatgg 3060 
ctcctacagc gacaacctgt accctgagag catcacgagt agggactgtg tccagctggg 3120 
acccccttcc gagggggagc tggtggagct ccggtggact gacggcaacc tctacaaggc 3180 
caagttcatc tcctccgtca ccagccacat ctaccaggtg gagtttgagg acgggtccca 3240 
gctgacggtg aagcgtgggg acatcttcac cctggaggag gagctgccca agagggtccg 3300 
ctctcggctg tcactgagca cgggggcacc gcaggagccc gccttctcgg gggaggaggc 3360 
caaggccgcc aagcgcccgc gtgtgggcac cccgcttgcc acggaggact ccgggcggag 3420 
ccaggactac gtggccttcg tggagagcct cctgcaggtg cagggccggc ccggagcccc 3480 
cttctaggac agctggccgc tcaggcgacc ctcagcccgg cggggaggcc atggcatgcc 3540 
ccgggcgttc gcttgctgtg aattcctgtc ctcgtgtccc cgacccccga gaggccacct 3600 
ccaagccgcg ggtgccccct agggcgacag gagccagcgg gacgccgcac gcggccccag 3660 
actcagggag cagggccagg cgggctcggg ggccggccag gggagcaccc cactcaacta 3720 
ctcagaattt taaaccatgt aagctctctt cttctcgaaa aggtgctact gcaatgccct 3780 
actgagcaac ctttgagatt gtcacttctg tacataaacc acctttgtga ggctctttct 3840 
ataaatacat attgtttaaa aaaaagcaag aaaaaaagga aaacaaagga aaatatcccc 3900 
aaagttgttt tctagatttg tggctttaag aaaaacaaaa caaaacaaac acattgtttt 3960 
tctcagaacc aggattctct gagaggtcag agcatctcgc tgtttttttg ttgttgtttt 4020 
aaaatattat gatttggcta cagaccaggc agggaaagag acccggtaat tggagggtga 4080 
gcctcggggg gggggcagga cgccccggtt tcggcacagc ccggtcactc acggcctcgc 4140 
tctcgcctca ccccggctcc tgggctttga tggtctggtg ccagtgcctg tgcccactct 4200 
gtgcctgctg ggaggaggcc caggctctct ggtggccgcc cctgtgcacc tggccagggg 4260 
aagcccgggg gtctggggcc tccctccgtc tgcgcccacc tttgcagaat aaactctctc 4320 
ctggggtttg tctatctttg tttctctcac ccgagagaaa cgcaggtgtt ccagaggctt 4380 
ccttgcagac aaagcacccc tgcacctccc atggctcagg atgagggagg cccccaggcc 4440 
cttctggttg gtagtgagtg tggacagctt cccagctctt cgggtacaac cctgagcagg 4500 
tcgggggaca cagggccgag gcaggccttc ggggcccctt tcgcctgctt ccgggcaggg 4560 
acgaggcctg gtgtcctcgc tccacccacc cacgctgctg tcacctgagg ggaatctgct 4620 
tcttaggagt gggttgagct gatagagaaa aaacggcctt cagcccaggc tgggaagcgc 4680 
cttctccagg tgcctctccc tcaccagctc tgcacccctc tggggagcct tccccacctt 4740 
agctgtctcc tgccccaggg agggatggag gagataattt gcttatatta aaaacaaaaa 4800 
atggctgagg caggagtttg ggaccagcct gggctatata gcaagacccc atcactacaa 4860 
attttttaca aattagctag gtgtggtggt gcgcacctgt ggtcccagct actcgggagg 4920 
ctgtggtggg aggattgctt gagtccagga ggttgaggct gcagtcagct cagattgcac 4980 
cactgcactc cagcctgggc aacagagcga gaccctgtct ccaaaaaaaa aa 5032 

<210> 27 
<211> 1355 
<212> DNA 
<213> Homo sapiens 
<220> 
<221> misc_feature 
<223> Incyte ID No: 3533891CB1 

<400> 27 

cggggacgcg cgcccggcct gtcgctgtgg aaaccgctag gccagcgctc gccgggacct 60 

ggaatccctg tacgccgagg tgggagccgg tggaccggtc ccccagccgg cccccacctc 120 

cgcttcccgg tgtttgaggg ttcgggcctc ccgccgggga gttcacccct cgggctcgtc 180 

agtagggctg tggctgtcgc ctcttcctgc agcgccaggc tccgcccggt ctcacagtcg 240 

gcttaggggc tttgcgtgca ctgcggttgg gtggaaaaac ccactcctgg ttgtttagac 300 

gttggcctgc agacgatgtc atttctgtat tcctctaagg caggaagtca ttatgcaact 360 
tacacatatt catcagattt cctctgactt acccggacat gtacatggga atgatgtgca 420 

ctgccaagaa atgtgggatt aggtttcagc ctccagctat tatcttaatc tatgagagtg 480 

aaatcaaggg gaaaattcgc cagcgcatta tgccagttcg aaacttttca aagttttcag 540 

attgcaccag agctgctgaa caattaaaga ataatccgcg acacaagagt tacctagaac 600 

aagtatccct gaggcagcta gagaagctat tcagtttttt acgaggttac ttgtcggggc 660 

agagtctggc agaaacaatg gaacaaattc aacgggaaac aaccattgat cctgaggaag 720 

acctgaacaa actagatgac aaggagcttg ccaaaagaaa gagcatcatg gatgaacttt 780 

ttgagaaaaa tcagaagaag aaggatgatc caaattttgt ttatgacatt gaggttgaat 840 

ttccacagga cgatcaactg cagtcctgtg gctgggacac agagtcagct gatgagttct 900 

gataccaaac actcaaaaca tgcattgggc tagcagaata tccatgttta ttaccagact 960 

ggttctggaa gaagctgtaa agaatactaa atatgttggg ttatagggga ttgaccatgt 1020 

tacttttcaa aaccaggaca tttaaagcat ctactatgta ggtgcatgag gagtatggga 1080 

aaaacagaat aaaggaatct gcctttaagg agcttacaat catgccgggt gcggtggctc 1140 

acgcctgtaa tcccagcact ttgggaggct gaggcgggtg gatcacctaa ggtcaggagt 1200 

tcgagaccag cctagccaac atggtgaaac ctcgcctcta ctaaaaatac aaaaattagc 1260 

caggcgtggt ggcgggtgcc tgtaatcccg gctactcagg aggctgaggc aggagattcg 1320 
cttgaacctg ggattaactg acgttgcagt gagcc 1355 

<210> 28 

<211> 4912 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 1510943CB1 

<400> 28 

cgggccccag cggcggcagc ggagagcgcg gtcccgggtc ggagcctggg acacctccgc 60 
acggacgggg cgggcggcgc ggacaggcca tggggacccg ggccgggcca gcggtggcgg 120 
gccagcggga gccccgggcc tgagaagtgg gcggcggggt ggcgggggcc atgacctcgg 180 
tgtggaagcg cctgcagcgc gtgggcaagc gggcggccaa gttccagttc gtggcctgtt 240 
accacgagct agtgttggag tgcaccaaga aatggcagcc agataagctg gtggtggtat 300 
ggacccgtcg gaaccgacgc atctgctcca aggcccacag ctggcagccg ggcatccaga 360 
acccataccg gggcaccgtg gtgtggatgg tacctgagaa tgtggacatc tctgtgaccc 420 
tctacaggga cccccacgtg gaccagtatg aggccaaaga gtggacattt attattgaaa 480 
atgagtctaa ggggcagcgg aaggtgctgg ccacggccga ggtggacctg gcccgccatg 540 
cagggcccgt gcctgtccaa gtcccactga ggctgcggct gaagccaaag tcagtgaagg 600 
tggtgcaggc tgagctgagc ctcactcttt ccggggtgct gctgcgggag ggccgtgcca 660 
cggacgatga catgcagagt ctcgcaagcc tcatgagtgt gaagcctagt gatgtgggca 720 
acttggatga ctttgctgag agtgatgaag atgaggctca tggcccagga gccccggagg 780 
cccgggctcg agtcccccag ccagatccct ctcgagagct gaagacgctt tgtgaggagg 840 
aggaggaagg ccaaggacga ccccagcagg cagttgccag cccttctaat gctgaggata 900 
ccagcccagc ccctgtgagt gctcctgcac ccccagccag aacctcccga ggccaggggt 960 
cagaacgagc taatgaagcg gggggccagg taggccctga ggccccaagg cccccggaaa 1020 
cctcaccaga gatgaggtct tcaaggcagc cagcccagga cacggccccc accccagccc 1080 
ctcggctccg gaaaggctct gatgccctcc ggcccccagt cccccagggg gaagatgagg 1140 
tccccaaagc ctcaggggct cctccagcag gattgggctc tgctagggag acccaggccc 1200 
aggcatgccc tcaggaaggg acagaagccc atggagctag gctgggcccg agcattgagg 1260 
ataaaggttc tggagaccct tttggaaggc agagactcaa ggctgaagag atggacactg 1320 
aggacaggcc agaggccagt ggggtggaca ctgagccaag gtcaggaggc agagaggcaa 1380 
acactaagag gtcaggagtc agagctgggg aggctgaaga gagttcagca gtttgtcaag 1440 
tggatgctga gcagaggtca aaggtgagac atgtggacac taagggacca gaggcgacag 1500 
gggtgatgcc tgaggcaaga tgcaggggga cccctgaggc tcctccaagg ggctctcagg 1560 
ggaggctggg agtcaggacc agggatgagg ctccctcagg cctgagcctg cccccagcgg 1620 
agcctgcagg gcactctggg caacttggtg acctcgaggg ggccagggct gctgcaggcc 1680 
aggagagaga gggtgcagaa gtgaggggtg gagcacctgg tattgagggg acaggcctgg 1740 
agcagggccc ttctgttgga gcaataagca ccaggcccca ggtgagcagc tggcaggggg 1800 
ccctgttatc aactgcccag ggggcaatat ccaggggtct gggaggctgg gaggcagaag 1860 
ctgggggttc aggggtcctg gaaacagaga ctgaggtggt agggttggag gtgctgggaa 1920 
cccaggagaa agaagttgag gggtcagggt tcccagagac taggacacta gaaattgaga 1980 
tattgggggc cttggagaaa gaagcagcaa gatcaagggt cctggagtca gaggttgctg 2040 
ggacagcaca gtgtgaggga ctggagaccc aggaaacaga ggtgggggtc atagagaccc 2100 
cagggacaga gactgaggta ttggggaccc agaaaacaga agctgggggt tcaggagttt 2160 
tgcagacaag aactacgata gcagagactg aggtactggt gacccaggag atatctgggg 2220 
atttagggcc actgaagata gaagatacaa tacagtctga gatgctgggg acccaggaga 2280 
cagaggtgga agcttctagg gtaccagagt cagaggctga ggggacagaa gctaaaatat 2340 
tagggaccca ggagataaca gctagggatt caggggtcag agagatagaa gcagagatag 2400 
cagagtctga catattggta gcccaggaga tagaggtggg acttttgggg gttctgggaa 2460 
tagagactgg ggcagcagaa ggtgcgatat tggggaccca agagatagca tctagggatt 2520 
caggggtccc agggttagaa gctgatacaa cagggatcca ggtgaaagag gttgggggtt 2580 
cagaggttcc agagatagcg actgggacag cagaaactga gatattgggg acccaagaga 2640 
tagcatctag gagttcaggg gtcccagggc tagaatctga ggtagctggg gcccaggaga 2700 
cagaggtcgg gggttcaggg atctcagggc ccgaggctgg aatggcagag gcccgagtac 2760 
tgatgacccg taagacagaa attatagttc cagaggctga gaaggaagag gctcagactt 2820 
cgggggtcca ggaagcagag actagagttg ggagtgctct caaatatgag gctttaaggg 2880 
ccccagtcac tcagccaaga gttttaggat cccaggaagc aaaagcagag atttcaggag 2940 
tacaagggtc agagactcaa gttctgagag tccaggaggc agaggctggg gtttggggga 3000 
tgtcagaggg caaatctggg gcttgggggg cccaggaagc agagatgaag gttttagagt 3060 
ctccagagaa caaatctggt acttttaagg cccaggaagc ggaggctggg gtcttgggaa 3120 
atgagaaggg gaaagaagct gagggaagcc tcacagaggc cagcctgcct gaagcacagg 3180 
tggccagtgg ggcaggggct ggggcgccca gggcctcttc cccagagaag gctgaagagg 3240 
acaggaggct gccgggcagc caggcaccac ctgccctggt cagctccagc cagtccctgc 3300 
tggagtggtg ccaggaagtc accactggct accgtggcgt ccgcatcacc aacttcacca 3360 
catcctggcg caacggcttg gccttctgtg ccatcctgca ccgattctac ccagacaaga 3420 
ttgactatgc ctcgctagac ccactcaaca tcaagcagaa caacaagcag gccttcgatg 3480 
gcttcgcggc tctgggcgtg tcgcggctgc tggagcccgc ggacatggtg ctactgtcgg 3540 
tgcccgacaa gctcatcgtc atgacgtacc tgtgccagat ccgcgccttc tgcaccgggc 3600 
aggagctgca gctggtacaa ctggagggcg gcggcggcgc cggcacgtac cgcgtgggca 3660 
gcgcccagcc cagcccgccc gacgacctgg acgccggagg cctggcgcag cggctgcgcg 3720 
gtcacggggc cgaggggccc caggagccca aggaggccgc agaccgcgca gacggggcgg 3780 
ccccgggggt ggcctccagg aacgcggtcg cgggccgcgc ctccaaggac ggcggggccg 3840 
aggccccccg agagtcgcga cccgcggagg tcccggccga ggggctggtg aacggggcgg 3900 
gggcaccggg cggcggcggc gtgaggctgc gacggccctc ggtcaacggg gagcccgggt 3960 
cggtgccccc gccccgcgcg cacggctcct tctcccacgt gcgcgacgcg gacctgctca 4020 
agaagaggcg ctcgcggctg cggaacagca gctcgttctc gatggacgat ccggacgcgg 4080 
gagccatggg agctgcggct gcagaaggcc aggcccctga ccccagccct gccccaggcc 4140 
cacccacagc tgcagactct caacagcccc ctggtgggag ttccccctcg gaggaaccac 4200 
ccccaagccc aggggaggag gctgggctgc aacggttcca ggacacaagt cagtacgtgt 4260 
gtgcagagct gcaggccctg gaacaggagc agaggcagat agatgggcgg gcggctgagg 4320 
tggagatgca gctgaggagc ctcatggagt caggtgccaa caagctgcag gaggaggtgc 4380 
tgatccagga gtggttcacc ctggtcaaca agaagaacgc tctcatccgg aggcaggacc 4440 
agctgcagct gctcatggag gagcaggact ^ggagcgaag gttcgagctg ctgagccgcg 4500 
agctgcgggc catgctggcc atcgaagact ggcagaaaac gtccgctcag cagcaccgag 4560 
agcagctcct actggaggag ctggtgtcgc tggtgaacca gcgcgatgag ctagtccggg 4620 
acctggacca caaggagcgg atcgccctgg aggaggacga gcgcctggag cgcggcctgg 4680 
aacagcggcg ccgcaagctg agccggcagt tgagccggcg ggagcgctgc gtgctgagct 4740 
gaggccgccg gcccgggtgg cccataactt ctcgcgtccc cggcgtccgc cgccgccccg 4800 
ggcctgcgct gcggacgacc cggccgtccc ggaggccgcg cgcgtgtccg ctaggggccg 4860 
ccggcgccct tccccgtata gggcagggcg gatccccgac cccacgggcg gg 4912 

<210> 29 

<211> 2241 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 2119377CB1 



<400> 29 

cccacgcgtc cgcgaggtag cggtggcctg cagcggcctc ctccccgcag tgaagcatgg 60 
gccagaagct ctcggggagc ctcaagtcag tggaggtgcg agagccggcg ctgcggccgg 120 
ccaagcggga gctgcggggt gcagagcccg ggcggccggc gcggctggac cagctgttgg 180 
acatgccagc ggcggggctg gctgtgcagc tgcggcacgc gtggaacccc gaggaccgct 240 
cgctcaacgt cttcgtcaag gacgacgacc ggctcacctt ccaccggcac cccgtggccc 300 
agagcaccga cggcatccgc ggcaag^rtgg gccacgcccg cggcctgcac gcctggcaga 360 
tcaactggcc ggctcggcag cgcggcaccc acgctgtagt tggtgtggcc acggcccgtg 420 
ctcccctgca ctccgtgggc tacacggcgc tggtaggcag tgacgccgag tcgtggggct 480 
gggacctggg ccgcagccgc ctctaccacg acggcaagaa ccagcccggc gtggcctacc 540 
cggcctttct ggggcccgac gaggcctttg cgctgcccga ctcgctgctc gtggtgctgg 600 
acatggatga gggcacactc agcttcatcg tggatggcca gtacctgggc gtggccttcc 660 
gaggtctcaa gggcaagaag ctgtacccgg tggtgagtgc cgtgtggggc cactgtgaag 720 
tcaccatgcg ctacatcaac ggccttgacc ccgagcccct gccactgatg gacctgtgcc 780 
ggagatccat ccgctcggcc ctgggccgcc agcgcctgca ggacatcagc tccctgcccc 840 
tgcctcagtc tctcaaaaac tatctgcagt accagtgagc caagcctgat gggcagcaca 900 
gacacagaca cacaccgcag ggcccgaccc tcctgtcatt cacagtccca tggcacatag 960 
gggaaaggat ctacccttct cctggctccc caggacactc agttctttca aagaccagga 1020 
tgtggtacca actttggaaa cgaaaggtct cttgccaaca gtatctactg ccctcgaggc 1080 
agccctccca agtcagacac ctccttcgga gccacagaga gcctggagtc tgcacctcct 1140 
ggaaatcctg ccaccaacca ggacacagca gccaccgtat tgatcagaga gcctgtttcc 1200 
ttattcaaga gaatgaataa aacatttagg caggagactt tctattgtgt gccccgttgc 1260 
agacagggcc agggagaaat tagccagtgc agggggaaaa ttgcctctga ataatgaatg 1320 
atgaaagcct gacgccgtgc ccctcctggc ccatacgcct tgccagggcg gcaggattgt 1380 
cacaccgttt taggactctg tgccactttg agagactgtt cccaggaggc ccaaccgcag 1440 
acctggcaag tggacagtgc agtgtggaga caccttccgg cttacctctt tgaacgttgt 1500 
ttacctaccc ctttccacgt gctccccttc ccagccactg actcacagtt ctctggatgc 1560 
ccagacacct ctcttcaggg aagatgagtc tgactggttt gccccaggga agagcgtatc 1620 
cttaccatta ttttaagtag catttgcatt ttaaaagagg aatgcggaga ggacagtact 1680 
tagtccaaag gtgctaacgg gggaactggg ggcattgtga cacccaagtc tgatgtgtcc 1740 
tgggttgggg ggccttcctg gagtgtcagg gtctctgggc aacgtccatt cagggtgcgg 1800 
catggctgtc acaaagcttt atttgagcaa attatttttt cactttagga gacttctaca 1860 
agtttgtttc ctgtttcaaa tgtgtgtgtg atgtgctgtt tatttatcag cttgaggtcc 1920 
atgggggcag ccttgtgact ggaagggtgg atatgggaga cacattctct acctgctccg 1980 
agcctggtcc tctcgcagga atgctgctgc tgcctccgcc gccactgctg ctgccacctc 2040 
ttatatgttt cagacactct ctgcccagac tcatttttaa ctggaaatca tcacagcagt 2100 
gggatatcag agccccagac agcactgcct cttcctcccc accccactgc ccccacctta 2160 
atgtgaattt gactgatgaa tgaagagcgt ttctaataaa gtttgtcatt cagtccttaa 2220 
aaaaaaaaaa aaaaaaaaaa a 2241 



<210> 30 
<211> 1853 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 3176058CB1 



<400> 30 

gaacgggacc gctttcccgg aagtgcttgc ggcctctgcc cagcgagctg ccccggggtc 60 
tctctggttt cctaatcagg gcaacgccgc gggagagaac ctttaccttg gctgcactaa 120 
gttctcggtg ccactccctg gcagggcggg accttgttta ggccctgtga tcgcgcggtt 180 
cgtagtagcg caaggcgcag agtggacctt gacccgccta gggcgggaag agtttggccc 240 
gccgggtccc aaagggcaga atggacgggc tcctaaatcc cagggaatcc tctaaattca 300 
ttgcagaaaa cagtcgggat gtgtttattg acagcggagg cgtacggagg gtggcagagc 360 
tgctgctggc caaggcggcg gggccagagc tgcgcgtgga ggggtggaaa gcccttcatg 420 
agctgaaccc cagggcggcc gacgaggccg cggtcaactg ggtgttcgtg acagacacgc 480 
tcaacttctc cttttggtcg gagcaggacg agcacaagtg tgtggtgagg tacagaggga 540 
aaacatacag tgggtactgg tccctgtgcg ccgccgtcaa cagagccctc gacgaaggga 600 
taccaataac tagtgcctcg tactacgcga cagtgaccct ggatcaggtt cggaatatac 660 
ttcgttctga cacagacgtt tccatgcctt tagtagaaga gaggcatcgg attctcaatg 720 
aaaccgggaa aattctgctg gagaagtttg gaggctcttt tctcaactgc gtccgagaaa 780 
gtgagaatag tgcgcagaag ttaatgcacc tggtggttga aagttttcct tcttacagag 840 
atgtgactct gtttgagggg aaaagagttt ctttttacaa acgagcccaa atccttgtag 900 
cagatacgtg gagtgtattg gaaggaaaag gagatggctg cttcaaggac atctccagta 960 
tcaccatgtt tgctgattat agattacctc aggttcttgc tcatcttgga gccctgaaat 1020 
actctgatga cctactgaag aagcttctca aaggagaaat gctctcatat ggagacaggc 1080 
aagaggtgga aatcagaggg tgctcgcttt ggtgtgttga gctgatccgg gattgtcttc 1140 
tggagcttat tgaacaaaag ggtgaaaaac ctaatggaga gatcaattcc attcttctgg 1200 
attattactt atgggactat gcccatgacc atagggaaga tatgaaagga attccgtttc 1260 
atcgcatacg ttgcatatat tattgacctc aagtgtaaac tgatccaaag aaaaccccct 1320 
gcgttttata tcatatcatc tgtacagttt tgctttgata tttagagaac atgatcgagg 1380 
ttataggaaa ttgattgccc attctcactt gaaaaatact tcctaggccg ggcacagtgg 1440 
tttatgcctg taatcccagc actgtgggag gctgaggcgg gtggatcatc tgaggtcagg 1500 
agtttgagac cacctggcca acatggtgaa accccatctc tactaaaaat acaaaattag 1560 
ctgggcgtgg tggcacgtgc ctgtaatccc agttacttgg gaggctgagg caagagaatc 1620 
gcttgaactc agaaggtgga ggttgcagtg aggcgagatt aggccattgc actccagcct 1680 
cagcaacaag agtgaaactt tgtctcaaaa aacaacaaca acaacaacaa caacaacaaa 1740 
acttcctagg ccagatgtgg tggctcatat gtataatctt agcactttgg gaggccaagg 1800 
caggaggatt gcttgaggcg aggagttcaa gaccagccta ggcaacatag gga 1853 

<210> 31 
<211> 2541 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 2299818CB1 



<400> 31 

aaaacattct tggccaaaat ctaggggaag ttactgccac ttcgtactat ataaggaaaa 60 
caaagacacc atggatgcta ttaatgtact ctccaaatac ttaagagtca agccaaatat 120 
attctcctac atgggaacca aagataaaag ggctataaca gttcaagaaa ttgctgttct 180 
caaaataact gcacaaagac ttgcccacct gaataagtgc ttgatgaact ttaagctagg 240 
gaatttcagc tatcaaaaaa acccactgaa attgggagag cttcaaggaa accacttcac 300 
tgttgttctc agaaatataa caggaactga tgaccaagta cagcaagcta tgaactctct 360 
caaggagatt ggatttatta actactatgg aatgcaaaga tttggaacca cagctgtccc 420 
tacgtatcag gttggaagag ctatactaca aaattcctgg acagaagtca tggatttaat 480 
attgaaaccc cgctctggag ctgaaaaggg ctacttggtt aaatgcagag aagaatgggc 540 
aaagaccaaa gacccaactg ctgccctcag aaaactacct gtcaaaaggt gtgtggaagg 600 
gcagctgctt cgaggacttt caaaatatgg aatgaagaat atagtctctg catttggcat 660 
aatacccaga aataatcgct taatgtatat tcatagctac caaagctatg tgtggaataa 720 
catggtaagc aagaggatag aagactatgg actaaaacct gttccagggg acctcgttct 780 
caaaggagcc acagccacct atattgagga agatgatgtt aataattact ctatccatga 840 
tgtggtaatg cccttgcctg gtttcgatgt tatctaccca aagcataaaa ttcaagaagc 900 
ctacagggaa atgctcacag ctgacaatct tgatattgac aacatgagac acaaaattcg 960 
agattattcc ttgtcagggg cctaccgaaa gatcattatt cgtcctcaga atgttagctg 1020 
ggaagtcgtt gcatatgatg atcccaaaat tccacttttc aacacagatg tggacaacct 1080 
agaagggaag acaccaccag tttttgcttc tgaaggcaaa tacagggctc tgaaaatgga 1140 
tttttctcta cccccttcta cttacgccac catggccatt cgagaagtgc taaaaatgga 1200 
taccagtatc aagaaccaga cgcagctgaa tacaacctgg cttcgctgag cagtaccttg 1260 
tccacagatt agaaaacgta cacaagtgtt tgcttcctgg ctccctgtgc atttttgtct 1320 
tagttcagac tcatatatgg atttcaaatc tttgtaataa aaattatttg tatttttaag 1380 
tttttattag cttaaagaaa taatttgcaa tatttgtaca tgtacacaaa tcctgaggtt 1440 
cttaatttta gctcagaata taaattagtc aaaatacact tcaggtgctt aaatcagagt 1500 
aaaatgtcag ctttacaata ataaaaaaag gactttggtt taaagtagca ggtttaggtt 1560 
ttgctacatt ctcaaaagac agcaggagta tttgacacat ctgtgatgga gtatacaaca 1620 
atgcatttta agagcaaatg caacaaaaca aatctggact atggataaat aatttgagag 1680 
ctgccaccca caaatataaa tacagtactc atgctgactg aaataataag acatctacaa 1740 
atttataaac aaaaagtgat tgtcattatc ctgcttatgt actagattca ggcaagcatt 1800 
atagactttt tggttgcggt ggcttttgca tttatattat caatgccttg caggaacgtt 1860 
gcattgatag gcccatttta tttttttatt ttfctttttcg agacaggatc tcactctgta 1920 
gcacaggctg gattgcagtg caatcctgca attctcaatc ttgcactgca gcctcgacct 1980 
cccaggctcc agtgactctc ccacctcagc ctcctaagta gctgggagta caggcgcgca 2040 
ccaccacgcc tagctgattt ttgtattttt ttgtagagac gggggtttgg ccatgttgcc 2100 
gaggctaact cctgggatta caggcatgag ctgtgctggc cgggtttttt tttcttgatg 2160 
taaacgtgta cagctgtttt attagttaag gtctaatttt tactctaggt gccttttatg 2220 
ttcagaactc tttccactgg actggtattt gctcaaaaat aaataatggt agagaagaaa 2280 
actataaaaa tggacaaggc tttcttctat cagtagcgtt taccctttgt caccagtggc 2340 
tttggtattt ccatgtctgg cattgcataa acttctctgg tgtgaaagga taaatatgcc 2400 
tttctaaagt tgtatatcaa aattgtatca atttttattt tctatgattt ctagaaacaa 2460 
atgtaataaa tatttttaaa atctcctttc tactggttat gtaaataaat caaataaata 2520 
tatcaaaaaa aaaaaaaaaa a 2541 

<210> 32 
<211> 4144 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 2729451CB1 

<400> 32 

gtcgagatgg agcccaactc actccagtgg gtcggctcac cgtgtggctt gcacggacct 60 

tacattttct acaaggcttt tcaattccac cttgaaggca aaccaagaat tttgtccctt 120 

ggcgactttt tctttgtaag atgtacgcca aaggatccga tttgcatagc ggagctccag 180 

ctgttgtggg aagagaggac cagccggcaa cttttatcca gctctaaact ttatttcctc 240 

ccagaagaca ctccccaggg cagaaatagc gaccatggcg aggatgaagt cattgctgtt 300 

tccgaaaagg tgattgtgaa gcttgaagac ctggtcaagt gggtacattc tgatttctcc 360 

aagtggagat gtggcttcca cgctggacca gtgaaaactg aggccttggg aaggaatgga 420 

cagaaggaag ctctgctgaa gtacaggcag tcaaccctaa acagtggact caacttcaaa 480 
gacgttctca aggagaaggc agacctgggg gaggacgagg aagaaacgaa cgtgatagtt 540 
ctcagctacc cccagtactg ccggtaccgc tcgatgctga aacgcatcca ggataagcca 600 
tcttccattc taacggacca gtttgcattg gccctggggg gcattgcagt ggtcagcagg 660 
aaccctcaga tcctgtactg tcgggacacc tttgaccacc cgactctcat agaaaacgag 720 
agtatatgcg atgagtttgc gccaaatctt aaaggcagac cacgcaaaaa gaaaccatgc 780 
ccacaaagaa gagattcatt cagtggtgtt aaggattcca acaacaattc cgatggcaaa 840 
gccgttgcca aggtgaaatg tgaggccagg tcagccttga ccaagccgaa gaataaccat 900 
aactgtaaaa aagtctcaaa tgaagaaaaa ccaaaggttg ccattggtga agagtgcagg 960 
gcagatgaac aagccttctt ggtggcactt tataaataca tgaaagaaag gaaaacgccg 1020 
atagaacgaa taccctattt aggttttaaa cagattaacc tttggactat gtttcaagct 1080 
gctcaaaaac tgggaggata tgaaacaata acagcccgcc gtcagtggaa acatatttat 1140 
gatgaattag gcggtaatcc tgggagcacc agcgctgcca cttgtacccg cagacattat 1200 
gaaagattaa tcctaccata tgaaagattt attaaaggag aagaagataa gcccctgcct 1260 
ccaatcaaac ctcggaaaca ggagaacagt tcacaggaaa atgagaacaa aacaaaagta 1320 
tctggaacca aacgcatcaa acatgaaata cctaaaagca agaaagaaaa agaaaatgcc 1380 
ccaaagcccc aggatgcagc agaggtttca tcagagcaag aaaaagaaca agagacttta 1440 
ataagccaga aaagcatccc tgagcctctc ccagcagcag acatgaagaa aaaaatagaa 1500 
gggtatcagg aattttcagc gaagcccctg gcatccagag tagacccaga gaaggacaac 1560 
gaaacagacc aaggttcaca cagtgagaag gtggcagagg aggcgggaga gaaggggccc 1620 
acacctccac tcccaagtgc tcctctggcc ccagaaaaag attcagcctt ggtccctggg 1680 
gccagcaaac agccactcac ctctcctagt gccctggtgg actcaaaaca agaatccaaa 1740 
ctgtgctgtt ttacagagag ccctgaaagt gaaccccaag aagcatcctt ccccaccaca 1800 
cagccaccgc tggcaaacca gaatgagacg gaggatgaca aactgcccgc catggcagat 1860 
tacattgcca actgcaccgt gaaggtggac cagctgggca gtgacgacat ccacaatgcg 1920 
ctcaagcaga ccccaaaggt ccttgtggtc cagtcgtttg acatgttcaa agacaaagac 1980 
ctgactgggc ccatgaacga gaaccatgga cttaattaca cgcccctgct ctactctagg 2040 
ggcaacccag gcatcatgtc cccactggcc aagaaaaagc ttttgtccca agtgagtggg 2100 
gccagcctct ccagcagcta cccttatggc tccccacccc ctttgatcag caaaaagaaa 2160 
ctgattgcta gggatgactt gtgttccagt ttgtcccaga cccaccatgg ccaaagcact 2220 
gaccatatgg cggtcagccg gccatcagtg attcagcacg tccagagttt cagaagcaag 2280 
ccctcggaag agagaaagac catcaatgac atctttaagc atgagaaact gagtcgatca 2340 
gatccccacc gctgcagctt ctccaagcat caccttaacc cccttgctga ctcctacgtc 2400 
ctgaagcaag aaattcagga gggcaaggat aaactcttag agaaaagggc cctcccccat 2460 
tcccacatgc ctagcttcct ggctgacttc tactcgtccc ctcatctcca tagcctctac 2520 
agacacaccg agcaccatct tcataatgaa cagacatcca aatacccttc cagggacatg 2580 
tacagggaat cggaaaacag ttcttttcct tcccacagac accaagaaaa gctccatgta 2640 
aattatctca cgtccctgca cctgcaagac aaaaagtcgg cggcagcaga agcccctacg 2700 
gatgatcagc ctacagatct gagccttccc aagaacccgc acaaacctac cggcaaggtc 2760 
ctgggcctgg ctcattccac cacagggccc caggagagca aaggcatctc ccagttccag 2820 
gtcttaggca gccagagtcg agactgtcac cccaaagcct gtcgggtatc acccatgacc 2880 
atgtcaggcc ctaaaaaata ccctgaatcg ctttcaagat caggaaaacc tcaccatgtg 2940 
agactggaga atttcaggaa gatggaaggc atggtccacc caatcctgca ccggaaaatg 3000 
agcccgcaga acattggggc ggcgcggccg atcaagcgca gcctggagga tttggacctt 3060 
gtgattgcag ggaaaaaggc ccgggcagtg tctcccttag acccatccaa ggaggtctct 3120 
gggaaggaga aggcctctga gcaggagagt gaaggcagca aagcagcgca cggtgggcat 3180 
tccgggggcg gatcagaagg ccacaagctt cccctctcct cccctatctt cccaggtctg 3240 
tattccggga gcctgtgtaa ctcgggcctc aactccaggc tcccggctgg gtattctcat 3300 
tctctgcagt acttgaaaaa ccagactgtg ctttctccac tcatgcagcc cctggctttc 3360 
cactcgcttg tgatgcaaag aggaattttt acatcaccga caaattctca gcagctgtac 3420 
agacacttgg ctgcggctac acctgtagga agttcatatg gggacctttt gcataacagc 3480 
atttaccctt tagctgctat aaatcctcaa gctgcctttc catcttccca gctgtcatcc 3540 
gtgcacccca gtacaaaact gtaggctcag ctctgcccag cagtccaaag cggcatggcc 3600 
aacagagctt cactccttac ccaggagtgc tggcttatag agttagaagt cagtatttct 3660 
tctaatctga ggctatgatc agtcccagct gtaggggccc agaggggagg tgaacatgcc 3720 
tgatttttgt gggacaactg tagcccacaa actgactggc tggtgagtct tgactccctt 3780 
ccaacacaga tgcccaggca cctccagatc attcacttcg cacgtgggcc ttgtgaaggg 3840 
atttgtgaat atccaggaag aacttagagg accccatctg agttcggatg gtcaggaaac 3900 
aatctgggca aaaaagaggc aggcatttca aaggaagggg caaggaagac tggcaaacag 3960 
atggcaaggg atgcccctct ttttcataaa actctccaag gttcaatcaa tgcaatgtat 4020 
agtgaaactt caatagatct ttcattttga cactattaaa caatccagag aagtaaacac 4080 
tgttaaattg actgtatata tttgcttctt aaaactacct gtatcactgt ttgctcacct 4140 
aatt 4144 

<210> 33 
<211> 5218 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 878534CB1 



<400> 33 

attcgcgcgc gccttcccta gccacccggg gttgcctcct aacatggaat ggccaaagga 60 
gcgccccttg cgggaagtga gggtgggttg ggactgggtc cgcgttgggg gaggtgcaat 120 
cttcgggttt cgcctctcgg ctccctctgg ctctggagtt gggacccctt gtgggctctg 180 
gaagtccgcc tgagacttgg gtcaaggagt caaactgtcg ccccccgctc ctcccccaga 240 
aatccggtga gcggtaagga aagtgatgcc aagtcttcga agcctcagtg acaaacgcat 300 
agcaagaaca catccactcc agagatacct tctcgaaaca aaagattttc ctacctgctt 360 
atacttggta accgagggaa ttactaagac ttcttgctca tttctgagta ttgtctttat 420 
atcctgacac tatgaatgct acttggatgc ctcttaagtc tgttctctgg ggaggcagta 480 
aggggccgtg gagctggcct cggcctcggc atcgggagag gctggacttc ctgtctctct 540 
gtgctgaatg gctgcgatgg cgcccgctct cactgacgca gcagctgaag cacaccatat 600 
ccggttcaaa ctggctcccc catcctctac cttgtcccct ggcagtgccg aaaataacgg 660 
caacgccaac atccttattg ctgccaacgg aaccaaaaga aaagccattg ctgcagagga 720 
tcccagccta gatttccgaa ataatcctac caaggaagac ttgggaaagc tgcaaccact 780 
ggtggcatct tatctctgct ctgatgtaac atctgttccc tcaaaggagt ctttgaagtt 840 
gcaaggggtc ttcagcaagc agacagtcct taaatctcat cctctcttat ctcagtccta 900 
tgaactccga gctgagctgt tggggagaca gccagttttg gagttttcct tagaaaatct 960 
tagaaccatg aatacgagtg gtcagacagc tctgccacaa gcacctgtaa atgggttggc 1020 
taagaaattg actaaaagtt caacacattc tgatcatgac aattccactt ccctcaatgg 1080 
gggaaaacgg gctctcactt catctgctct tcatgggggt gaaatgggag gatctgaatc 1140 
tggggacttg aaggggggta tgaccaattg cactcttcca catagaagcc ttgatgtaga 1200 
acacacaatt ttgtatagca ataatagcac tgcaaacaaa tcctctgtca attccatgga 1260 
acagccggca cttcaaggaa gcagtagatt atcacctggt acagactcca gctctaactt 1320 
ggggggtgtc aaattggagg gtaaaaagtc tcccctgtct tccattcttt tcagtgcttt 1380 
agattctgac acaaggataa cagctttact gcggcgacag gctgacattg agagccgtgc 1440 
ccgcagatta caaaagcgct tacaggttgt gcaagccaag caggttgaga ggcatataca 1500 
acatcagctg ggtggatttt tggagaagac tttgagcaaa ctgccaaact tggaatcctt 1560 
gagaccacgg agccagttga tgctgactcg aaaggctgaa gctgccttga gaaaagctgc 1620 
cagtgagacc accacttcag agggacttag caactttctg aaaagcaatt caatttcaga 1680 
agaattggag agatttacag ctagtggcat agccaacttg aggtgcagtg aacaggcatt 1740 
tgattcagat gtcactgaca gtagttcagg aggggagtct gatattgaag aggaagaact 1800 
gaccagagct gatcccgagc agcgtcatgt acccctgaga cgcaggtcag aatggaaatg 1860 
ggctgcagac cgggcagcta ttgtcagccg ctggaactgg cttcaggctc atgtttctga 1920 
cttggaatat cgaattcgtc agcaaacaga catttacaaa cagatacgtg ctaataaggg 1980 
gttgatagtt cttggggagg tacctccccc agagcataca acagacttat ttcttccact 2040 
tagttctgag gtgaagacag atcatgggac tgataaattg attgagtctg tttctcagcc 2100 
attggaaaac catggtgccc ctattattgg tcatatttca gagtcactgt ctaccaaatc 2160 
atgtggagca ctcagacctg tcaatggagt tattaacact cttcagcctg tcttggcaga 2220 
ccacattcca ggtgacagct ctgatgctga ggaacaatta cataagaagc aacgactgaa 2280 
tctcgtctct tcatcatctg atggcacctg tgtggcagcc cggacacgtc ctgtactgag 2340 
ctgtaagaag cggaggcttg ttcgacccaa cagcatcgtt cctctttcca agaaggttca 2400 
ccggaacagc acaatccgcc ctggctgtga tgtgaatccc tcctgcgcac tgtgtggttc 2460 
aggcagcatc aacaccatgc ctcccgaaat tcactatgaa gcccctctgt tggaacgtct 2520 
ttcccagttg gactcttgtg ttcatcctgt tctagcattt ccagatgatg ttcccacaag 2580 
cctgcatttc cagagcatgc tgaaatctca gtggcagaac aagccttttg acaaaatcaa 2640 
acctcccaaa aagttatcgc ttaagcacag agcacccatg ccgggcagtc tgccagattc 2700 
agctcgtaag gacaggcaca aattggtcag ctccttccta acaacagcca tgttgaagca 2760 
tcacacagac atgagcagtt cgagctactt ggcagccacc caccatcctc cacacagtcc 2820 
cttggtgcga cagctctcca cctcctcaga ttcccctgca cccgccagct ctagctcaca 2880 
ggttacagcc agcacatcgc agcagccagt aaggaggaga aggggagaga gctcatttga 2940 
tattaacaac attgtcatcc caatgtctgt tgctgcaaca actcgcgtag agaaactgca 3000 
atacaaggaa atccttacgc ccagctggcg ggaggttgat cttcagtctc tgaaggggag 3060 
tcctgatgag gagaatgaag agattgagga cctatccgac gcagccttcg ccgccctgca 3120 
tgccaaatgt gaggagatgg agagggcacg gtggctgtgg accacgagtg tgccacccca 3180 
gcggcggggc agcaggtcct acaggtcatc agacggccgg acaacccccc agctgggcag 3240 
tgccaacccc tccacccccc agcctgcctc ccctgatgtc agcagtagcc actctttgtc 3300 
agaatactcc catggtcagt cccctaggag ccccattagc ccggaactgc actcagcacc 3360 
cctcacccct gtggctcggg acactctgcg acacttagcc agtgaggata cccgttgttc 3420 
cacaccagag ctggggctgg atgaacagtc tgtccagccc tgggagcggc ggaccttccc 3480 
cctggcgcac agtccccagg cggagtgtga ggaccagctg gatgcacagg agcgagcagc 3540 
ccgctgcact cgacgcacct caggcagcaa gactggccgg gagacagagg cagcgcccac 3600 
ctcgcctccc attgtccccc tcaagagtcg gcatctggtg gcagcagcca cagctcagcg 3660 
cccgactcac agatgagcgg gagacagcca tctaaacaga ctcactaact attggcatta 3720 
aagcttcaga aatctctgcg tttgatattc aaacatcata tgccggaaat tttcacagtt 3780 
tttagtgaac ttaaggaatt tagatcctac tttggtattt ttttttcttg ttttaatttt 3840 
tgttttgttt ttgtttccat gttttcttgt cacacacctg agcacttcct cccgttggca 3900 
aacagaagtt caggatgaga ccctgctggc ctggtcctgg cacatcctct gcactgttga 3960 
atcactggac ttactgatct tagatgacca ccccctccct cacacctgtg ggcagggcag 4020 
aacagcctgg cgggctacag tttagcatgg ccttcttgag ctagggtgga atggggcagg 4080 
gtgctctgga ctcttacccc ctcccctccc atctgtggct tggctctgct gtggccctcc 4140 
tggctgggtc cccttggttt ttcgtgctgg aacatcccca ccagagcctc tctgccataa 4200 
ctgccagctg ctctccccga gtgctcagct ggcagaacac ctttccttct cacccagaac 4260 
ttaagagact gattttttgt ttcatctgca tttggtcttc tctgttttga ctctttcact 4320 
gcagtaacct ggctgtggct gctcaggttc ccctcctcat gccccttggt acccttccct 4380 
gtctgctctc ccatgccatg tacacaccca caacccgtcc ttccacttgg aatattttta 4440 
ccacctatcc tgatctttga aggtagggtt aggactactt aacctctatt cccactcccc 4500 
tgcaaactgg gggttgtggg aagtgagcag ccatctccct gtgtgatttt tttttttttt 4560 
ccctctgatt cactttgcca tgtttccttc acatccagat ccctgtcggt gttagttcca 4620 
ctcttggtct ttcacgctcc ccttgcctgt ggaacattgt ctggtcctag ctgtggttcc 4680 
cattgttccc ccttcaccct tctctgttaa ccttgtgcct gtctcctgta tgatcacatc 4740 
accaaaaagg gggagggggg agaagactct ttttttttgg ccattttgta atcgtataaa 4800 
aatagtagac aactgcttaa tggttggggt tttttcacaa ttttcaacat tagtgatttt 4860 
tttttctgtt tgcaagttaa agggtttgtc attgtttctt taaaaaaaaa tacaataatg 4920 
caccatatcc ctatgcataa agtgcttctt ctatttataa ggttgaaaat tctgaataac 4980 
ccttttagca ttgaaaaaaa aaacaaaaac aaaaaatgga aaaaaaaaac cttgtatttt 5040 
gtaaatattt tcttttcctg ctttggagct gtgtaatggc agcgaaacat gtagctgtct 5100 
ttgttctata gaaatgcttt tcttcagaga agctgatctt tgttaatgtc ttgattctgt 5160 
tcgcaaagca cagactagtg cttaaaaaaa aaaaagaagg aaaaattgaa aaaaatat 5218 

<210> 34 
<211> 763 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 2806157CB1 
<400> 34 

ggcaaaacag tgacgcagca gtgtgttacc tgccgacagc ataatgtgag gcaaggtcta 60 
gctgttcccc cccggcatac aagcttatag agcagcctcc tttgaagatc tccaggtgga 120 
cttcacagag atgccaaagt gtggaggtgt tcgagtgtgg atcaaggact ggaacgtagc 180 
ctctttgtgc ccatggtgga aaggacccca gactgtcgtc ctgatcactc ccactgctgt 240 
gaacgtagag agaatcctag cctggatcca tcacaaccgt gtaaaacctg cagcgcctga 300 
atcctgggag gcaagaccaa gtctggacaa cccctgcaga gtgaccctga agaagatgac 360 
aagccctgct ccagtcacac ccagaagctg actggtccac gcacagccga agcatgagga 420 
agctcattgt gggcttcatt tttcttaaat tttggactta cagtaagggc ttcaactgtt 480 
cttactcaaa ctggggacta ttcccagtgt attcatcagg tcagtgaggt aggacagcaa 540 
atgaaaacaa tctttctgtt ctatagttat tatcaatgta tgggaacgtt aaaagagact 600 
tgtttgtata atgccactca gtacaaggta tgtagcccag gaaatgactg acctgatgtg 660 
tgttataacc catctgagcc ccctacaacc accagttttg aaataagatt aagaactggc 720 
cttttcctag gtgatacaag tgaaataata actagaacag aag 763 

<210> 35 

<211> 869 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 5883626CB1 

<400> 35 

gcgagaaggg gagtggaagg caggggctga agacacaggc caggcggaat gaagatgatg 60 
gtggtcttgc tcatgctgtc ctcgctcagc cggctcctgg gcctcatgag gccatcatct 120 
ctcaggcaat acctggactc tgtgcccttg ccaccctgcc aggagcaaca gccaaaggct 180 
agtgccgagc tagaccacaa ggcctgctac ctgtgccaca gcttgctgat gctggccggg 240 
gtagttgtta gctgccagga catcactcca gaccagtggg gcgagctgca gctgctgtgc 300 
atgcagttgg accgccacat cagcacgcag atccgggaga gcccccaggc catgcaccgc 360 
accatgctca aggacctggc tacccagacc tacatccgtt ggcaggagct gctgacccac 420 
tgccagcccc aggcccagta tttcagcccc tggaaagaca tctaaaggga cagggtcagg 480 
gcagcccagg gctcctggct tcagcaggaa gtgaacaggc tcagggaact ggaggaagcg 540 
aagcatcaag gccagaggag gccacatgct gaccagcctg atgaggcaag agcctgcccc 600 
tgccaccgcc ccgacccctc tcctctctgc aagagcctgc ctctgccacc gccccgaccc 660 
cctctcctct cagcaaggga tgggcctctc tgcctcgccc acccctcagc cctcctccca 720 
gccatctcct cttccctaag gcctctgtct ccatagctct ggtttccctg ggcctcagtc 780 
ctccccaccc tccttcctct gtctccctgt cactaatgtg aggtttcttt gtgcacatta 840 
aagtcttctt tcagcaaaaa aaaaaaaaa 869 

<210> 36 
<211> 2875 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 2674016CB1 

<400> 36 

ggggcgccat cttgtcttgt tcccgaagaa gtagaagcat cgaaagcgtt ggagaggtgt 60 
taccggaacg gcggcgacaa gggtgttccc gaactagagt ggggcataca taatcttgct 120 
gctatgcttc gaagctgtag tctgaatcaa cctaagtttt aaacagaagg tgaacctctg 180 
agatagaaaa tcaagtatat tttaaaagaa gggatgtggg atcaaggagg acagccttgg 240 
cagcagtggc ccttgaacca gcaacaatgg atgcagtcat tccagcacca acaggatcca 300 
agccagattg attgggctgc attggcccaa gcttggattg cccaaagaga agcttcagga 360 
cagcaaagca tggtagaaca accaccagga atgatgccaa atggacaaga tatgtctaca 420 
atggaatctg gtccaaacaa tcatgggaat ttccaagggg attcaaactt caacagaatg 480 
tggcaaccag aatggggaat gcatcagcaa cccccacacc cccctccaga tcagccatgg 540 
atgccaccaa caccaggccc aatggacatt gttcctcctt ctgaagacag caacagtcag 600 
gacagtgggg aatttgcccc tgacaacagg catatattta accagaacaa tcacaacttt 660 
ggtggaccac ccgataattt tgcagtgggg ccagtgaacc agtttgacta tcagcatggg 720 
gctgcttttg gtccaccgca aggtggattt catcctcctt attggcaacc aggacctcca 780 
ggacctccag cacctcccca gaatcgaaga gaaaggccat catcattcag ggatcgtcag 840 
cgttcaccta ttgcacttcc tgtgaagcag gagcctccac aaattgacgc agtaaaacgc 900 
aggactcttc ccgcttggat tcgcgaaggt cttgaaaaaa tggaacgtga aaagcagaag 960 
aaattggaga aagaaagaat ggaacaacaa cgttcacaat tgtccaaaaa agaaaaaaag 1020 
gccacagaag atgctgaagg aggggatggc cctcgtttac ctcagagaag taaatttgat 1080 
agtgatgagg aagaagaaga cactgaaaat gttgaggctg caagtagtgg gaaagtcacc 1140 
agaagtccat ccccagttcc tcaagaagag cacagtgacc ctgagatgac tgaagaggag 1200 
aaagagtatc aaatgatgtt gctgacaaaa atgcttctaa cagaaattct gctggatgtc 1260 
acagatgaag aaatttatta cgtagccaaa gatgcacacc gcaaagcaac gaaagctcct 1320 
gcaaaacagc tggcacagtc cagtgcactg gcttccctca ctggactcgg tggactgggt 1380 
ggttatggat caggagacag tgaagatgag aggagtgaca gaggatctga gtcatctgac 1440 
actgatgatg aagaattacg gcatcgaatc cggcaaaaac aggaagcttt ttggagaaaa 1500 
gaaaaagaac agcagctatt acatgataaa cagatggaag aagaaaagca gcaaacagaa 1560 
agggttacaa aagagatgaa tgaatttatc cataaagagc aaaatagttt atcactacta 1620 
gaagcaagag aagcagacgg tgatgtggtt aatgaaaaga agagaactcc aaatgaaacc 1680 
acatcagttt tagaaccaaa aaaagagcat aaagaaaaag aaaaacaagg aaggagtagg 1740 
tcgggaagtt ctagtagtgg tagttccagt agcaatagca gaactagtag tactagtagt 1800 
actgtctcta gctcttcata cagttctagc tcaggtagta gtcgtacttc ttctcggtct 1860 
tcttctccta aaaggaaaaa gagacacagt aggagtagat ctccaacaat caaagctaga 1920 
cgtagcagga gtagaagcta ttctcgcaga attaaaatag agagcaatag ggctagggta 1980 
aagattagag atagaaggag atctaataga aatagcattg aaagagaaag acgacgaaat 2040 
cggagtcctt cccgagagag acgtagaagt agaagtcgct caagggatag acgaaccaat 2100 
cgtgccagtc gcagtaggag tcgagatagg cgtaaaattg atgatcaacg tggaaatctt 2160 
agtgggaaca gtcataagca taaaggtgag gctaaagaac aagagaggaa aaaggagagg 2220 
agtcgaagta tagataaaga taggaaaaag aaagacaaag aaagggaacg tgaacaggat 2280 
aaaagaaaag agaaacaaaa aagggaagaa aaagatttta agttcagtag tcaggatgat 2340 
agattaaaaa ggaaacgaga aagtgaaaga acattttcta ggagtggttc tatatctgtt 2400 
aaaatcataa gacatgattc tagacaggat agtaagaaaa gtactaccaa agatagtaaa 2460 
aaacattcag gctctgattc tagtggaagg agcagttctg agtctccagg aagtagcaaa 2520 
gaaaagaagg ctaagaagcc taaacatagt cgatcgcgat ccgtggagaa atctccaagg 2580 
tctggtaaga aggcaagccg caaacacaag tctaagtccc gatccaggta gtatactttt 2640 
taaagtattt tgtccgattt ttaaaaaaaa ttgactgaat ttattccaag ttgaaagtgt 2700 
cctttctctc tctctttaat aaactcagtt tggtacttga taaataatca tagtcttaaa 2760 
tgttagaaat cctatataat attatttatt taaaattgca gatttttaat ttaaaataca 2820 
tttttatttt taaattttgt cttttccctt tttttttcag atcacaaccc ctccc 2875 

<210> 37 

<211> 1839 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 5994159CB1 

<400> 37 

ctcgaggggc cctctccctg ctggcacctg ggagccatgc atgaatcaag gagtcgctgg 60 
acagagcctg ggtgttccca gtgctggtgc gaggacggga aggtgacctg tgaaaaggtg 120 
aggtgtgaag ctgcttgttc ccacccaatt ccctccagag atggtgggtg ctgcccatcg 180 
tgcacaggct gttttcacag tggtgtcgtc cgagctgaag gggatgtgtt ttcacctccc 240 
aatgagaact gcaccgtctg tgtctgtctg gctggaaacg tgtcctgcat ctctcctgag 300 
tgtccttctg gcccctgtca gaccccccca cagacggatt gctgtacttg tgttccaggc 360 
agatggctcg gtgagctgca agaggacaga ctgtgtggac tcctgccctc acccgatccg 420 
gatccctgga cagtgctgcc cagactgttc agcaggctgc acctacacag gcagaatctt 480 
ctataacaac gagaccttcc cgtctgtgct ggacccatgt ctgagctgca tctgcctgct 540 
gggctcagtg gcctgttccc ccgtggactg ccccatcacc tgtacctacc ctttccaccc 600 
tgacggggag tgctgccccg tgtgccgaga ctgcaactac gagggaagga aggtggcgaa 660 
tggccaggtg ttcaccttgg atgatgaacc ctgcacccgg tgcacgtgcc agctgggaga 720 
ggtgagctgt gagaaggttc cctgccagcg ggcctgtgcc gaccctgccc tgcttcctgg 780 
ggactgctgc tcttcctgtc cagattccct gtctcctctg gaagaaaagc aggggctctc 840 
ccctcacgga aatgtggcat tcagcaaagc tggtcggagc ctgcatggag acactgaggc 900 
ccctgtcaac tgtagctcct gtcctgggcc cccgacagca tcaccctcga ggccggtgct 960 
tcatctcctc cagctccttt taagaacgaa cttgatgaaa acacagactt tacctacaag 1020 
cccggcagga gctcatggtc cacactcact cgctttgggg ctgacagcca ctttcccagg 1080 
ggagcctggg gcctcccctc gactctcacc agggccttcg acccctccag gagcccccac 1140 
tctacctcta gcttccccag gggctcctca gccacctcct gtgactccag agcgctcgtt 1200 
ctcagcctct ggggcccaga tagtgtccag gtggcctcct ctgcctggca ccctcctgac 1260 
ggaagcttca gcactttcca tgatggaccc cagcccctcg aagaccccca tcaccctcct 1320 
cgggcctcgc gtgctttctc ccaccacctc tagactctcc acagcccttg cagccaccac 1380 
ccaccctggc ccccagcagc ccccagtggg ggcttctcgg ggggaagagt ccaccatgta 1440 
aggaggtcac tgtgtccggg agactctgga gagaggacct ctgccagtgg cccagggtgt 1500 
gtgcagggca gctccaagga tgaacctggt ggggatgcct gggctccctc ctgcaggggc 1560 
cctggtgagg atggaagacc cccaaggctg gatgtaacct tgttcccaag aagtgtttgg 1620 
aatgtgctgt aagaatggag gaagtcgttt ccactgtcag catcctccct ggaccgcgtg 1680 
gctggctcat cttttgagaa gggttgggac tgccaagttc tcctggagga agagttgcgt 1740 
ccggctggga ttccactcac tgggactgta ccgccaggtg tcatgcgtct ctctgaggtt 1800 
tcctgattaa aggttgtctc ggtttcaaaa aaaaaaaaa 1839 

<210> 38 

<211> 1232 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 2457335CB1 

<400> 38 

gggcagcctg cgcctgggta ccgaggctgc tgcgcggcgg acagcgggcg cgatgtatct 60 
ccgcagggcg gtctccaaga ctctggcgct gccgctgagg gcgcccccca accccgcgcc 120 
gctcggaaag gacgcatctc tgcgccggat gtcatctaac agattccctg gatcatctgg 180 
atcaaatatg atttattatc tggttgtagg cgtcacagtc agtgctggtg gatattatgc 240 
ttacaagaca gtaacatcag accaagccaa acacacagaa cataaaacaa atttgaaaga 300 
aaaaacaaaa gcagagatac atccatttca aggtgaaaag gagaatgttg cggaaactga 360 
gaaagcaagt tcagaagccc cagaagaact tatagtggaa gctgaggtgg tagatgctga 420 
agaaagtccc agtgctacag ttgtggtcat aaaagaggca tctgcctgtc caggtcacgt 480 
ggaggctgct ccggagacca cagcagtcag tgctgaaacc gggccagagg tcacagatgc 540 
agcggcgagg gaaaccacgg aagtaaaccc tgaaacgacc ccagaggtta caaatgctgc 600 
cctggatgaa gctgtcacca tcgataatga taaagataca acaaagaacg aaacctctga 660 
tgaatatgct gaactagaag aagaaaattc tccagctgag tcagagtcct ctgctggaga 720 
tgatttacag gaggaagcca gtgttggctc tgaggctgct tcggctcaag gcaatctcca 780 
gccggtagac atttcagcaa caaatgccat agggtgtctg ataagtgctt tggtgttttt 840 
agtacactta gtttaaaaaa aaaaagactt attttctaga aaacgttaag gggtctggaa 900 
gattttgggc atccactgat agattttagg actgagagac ttgagattgg tacatttctt 960 
actactcttc tacggtctga gatcagaata tgacattgca gtaaagagct taaatagttt 1020 
catcttgggt ttttttttct agacactttt cttttatccc agtcatctct gaatttacat 1080 
tttatattaa agaaatggag ctatccataa aacctgtatt tactttaatt tcaccatgct 1140 
ttcagcttac tttatggtat gatatgtgag gctaacatat agttggggtc caataaaatt 1200 
tgaaaacgtg taaaaaaaaa aaaggggggc cg 1232 

<210> 39 

<211> 3250 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 2267802CB1 



<400> 39 

ggcccccgcc caggtgtctc cctttgggaa 
tggtggtccc gcggacccct cgtccctccg 
tggggagaac gccccggagt ccagctcctc 
tccacaggtg ccgcctccgg aggaagaatc 
ccccaagaaa ctctgtgggt atttaagtaa 
gaaatcccgc tggttcttct acgacgaaag 
tcaggatgcc aatcccttgg acagcatcga 
ggacgctgag gaggggatct tcgaaatcaa 
cgccaccaag caagcgatgc tgtactggct 
ccacaacagc ccgccggcac ctcctgccac 
cgtcctgcac ctcgagctag ggcaagaaga 
gaaaacaccc cctgggctag tgggcgtggc 
gaatatttcc ctcaagcacc tggggactga 
caacaagcag gcccagggaa caggccatga 
ggagcctcag agggaggagc agccctcggc 
agaggattct ccaaagcctg cacccaagcc 
caagcgccag aacaacacct tcccattctt 
ccaggagaaa gtggcagcct tggagcaaca 
tcagaaggag ctagtgaaga tcctgcacaa 
ggcgtccagc gcatacctgg cggcggctga 
caaagtgcgg cagatcgcgg agctgggccg 
gagcctggcg cacacagcga gcctgcggga 
gcagctgctt atggacaaga accacgccga 
ggtcacccag gacttcacgc acccccctga 
cagggacttc ctgagccagc aggggaagat 
ccggacccag aactgcttcc tcaactccga 
ggtggctgag aaggagaagg cccttctgac 
ccaggtggaa agcaagtacc tggccggtct 
agccagcgag tgctcagagc tgctgaggca 
tggggaggcc tcatctgaca gcatcgagct 
cttcctgacg gtgcccgact atgaggtgga 
attggagtca cgatcccacc acctgctggg 
gcgctgggct gccctgggcg atcttgtgcc 
aggagtaccc cgtgaacacc ggcctcgtgt 
gcacctgcac actccaggct gctaccagga 
ccctgctgcc cgccagattg agctggacct 
cacctgcccc acctccagct tccccgacaa 
gcagaacccc accatcggct actgccaggg 



gctgcccgcc gagtctccga gatttgtccc 60 
cagtctccgg ctggcagcga tggagggcgc 120 
tgcccctggg tccgaagagt ctgccaggga 180 
gggggactgc gcccggtccc tggaggcggt 240 
gttcggcggc aaagggccca tccggggctg 300 
gaaatgtcag ctgtattact cgcggaccgc 360 
cctctccagt gcagtgtttg actgtaaggc 420 
gactcccagc cgggttatta ccctgaaggc 480 
gcagcagctg cagatgaagc gctgggaatt 540 
ccctgatgcc gccctggctg ggaatgggcc 600 
ggcagagctg gaggagttcc tgtgccctgt 660 
agctgccttg cagcccttcc ctgcccttca 720 
aatacagaac acaatgcaca acatccgtgg 780 
acctccaggg gaagattcta cacagagtgg 840 
ctctgacgcc agcaccccag tgagagagcc 900 
ttctctgacc atcagtttcg ctcagaaagc 960 
ttctgaagga atcacacgga accgaactgc 1020 
ggttctgatg ctcaccaagg agttaaagtc 1080 
ggcactggag gccgcccagc aggagaagcg 1140 
ggacaaggac cggctggagc tggtgcggca 1200 
gcgggtggag gccctggagc aggagcggga 1260 
gcagcaggtg caggagctac agcagcacgt 1320 
gcagcaggtc atctgcaagc tctctgagaa 1380 
ccagtctcct ttgcgccccg acgctgccaa 1440 
agagcacctg aaggatgaca tggaagctta 1500 
gatccaccag gtcacaaaga tctggagaaa 1560 
gaagtgcgcc tacctccaag ccagaaactg 1620 
gagaaggctg caggaggccc tgggggacga 1680 
gcttgtccag gaggcactgc agtgggaagc 1740 
gagccccatc agtaagtatg atgagtacgg 1800 
agacctgaag ctgctggcca agatccaggc 1860 
cctcgaggct gtggatcggc cgctgaggga 1920 
ctcagccgag ctcaagcagc tactgcgggc 1980 
ctggaggtgg ctggtccacc tccgtgtcca 2040 
actgctgagc cggggccagg cccgcgagca 2100 
gaaccggacc ttccccaaca acaaacactt 2160 
gctccgccgg gtgctgctgg ccttctcctg 2220 
cctgaacagg ctggcggcca ttgccctgct 2280 
ggtcctagag gaggaggaga gcgccttctg gtgcctggtg gccattgtgg agaccatcat 2340
gcccgctgat tactactgca acacgctgac ggcatcccag gtggaccagc gggtgctcca 2400 
ggacctgctc tcggagaagc tgcccaggct gatggcccat ctggggcagc accacgtgga 2460 
tctctccctc gtcaccttca actggttcct cgtggtcttt gcggacagtc tcattagcaa 2520 
catcctcctt cgggtctggg atgccttcct gtacgagggg acgaaggtgg tgtttcgcta 2580 
tgccttggcc attttcaagt acaacgagaa ggagatcttg aggctacaga atggcctgga 2640 
aatctaccag tacctgcgct tcttcaccaa gaccatctcc aacagccgga agctgatgaa 2700 
catcgccttc aatgacatga accccttccg catgaaacag ctgcggcagc tgcgcatggt 2760 
ccaccgggag cggctggagg ctgagctgcg ggagctggag cagcttaagg cagagtacct 2820 
ggagaggcgg gcatcccggc gcagagctgt gtccgagggc tgtgccagcg aggacgaggt 2880 
ggagggggaa gcctgacttg gccacctccc ctccccacag ccttcctcac ccttggctgg 2940 
cagacccact ggaggtcagg cacggaccag tggcccagcc ctgggtgtcc catcaccatg 3000 
tgaccttgga catgtccctt cccctctctg gccctcagtt tccccactgg gacattgtgt 3060 
gctgcaaagc cattggttgg gctacttctt cataggcact tacttaccca gggatgccac 3120 
cctttcgtca cctcttccac agagcacttt ggcatgtaaa caagcaagag cactgcctct 3180 
atagggtaac ctggaacatt ctctaggtta tatcaatata aaacaatgta aatggtggaa 3240 
aaaaaaaaaa 3250 

<210> 40 

<211> 3621 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 3212060CB1 

<400> 40 

aaggcgagta gcatgtgcgg gagactcacg ttgccggcga agtgggagag agaaaagtgg 60 
taacctgggg ctgggggccg gcgcggcgga gctcggagta gtagagcgga gtgaagacac 120 
gggggaggat agagactggc attcctttgg gccgggggat tggcgggagt cgtgctgggt 180 
gctctcgccg tgttgaggtc ccagtgaggg gaaggagaag cggaagaggg tctctagtcg 240 
gggcctaggg caaagggact acaaaaagga tgcagatgac tatagaaatg aggacgacga 300 
ggagatgctg tggaggagca gtagaggtga gaagatgatg caaagaaact gtgtcagtga 360 
ggaactgtat agagggtcat agaggtgagg tggcggagag aaactaacta acggaccata 420 
gaggtggggg agccattgta gaaggacgtg gacgcgaaag ggtcgtgtag atgggcatat 480 
gtgtgaagca gcaacgtaga ggggctgaag aggagaaatt catggagaga aagaatgcac 540 
ctagagtgag ctctgcagag tgctgcgtgg gatatcccta gagtttggtc tagtgaaggc 600 
acgctaacca ggcacctaag gcatttcaag tagtgacttc ccacatttgg ctaggaatgt 660 
gggtcctcct ccgaagtggg taccccctcc gtatcttgtt acccctgcgt ggggagtgga 720 
tgggtcggag gggcctgccc cgaaacttgg ccccaggccc tcctcgcaga cgttacagga 780 
aggagactct ccaagccttg gatatgccag tgttgcctgt aactgcaact gaaatccgcc 840 
agtatttgcg ggggcatggg atccccttcc aggatggtca cagttgcctg cgggcactga 900 
gcccctttgc agagtcttca cagctcaaag gccagactgg tgttaccact tccttcagcc 960 
tcttcattga caagaccaca ggccactttc tctgcatgac cagcctagca gaagggagct 1020 
gggaagactt ccaggccagc gtggaggggc gaggggatgg ggccagggag gggtttctgc 1080 
ttagcaaggc accagaattt gaggacagcg aggaggtccg gaggatctgg aaccgagcaa 1140 
tacctctctg ggagctgcct gatcaggagg aggttcagct ggctgataca atgtttggcc 1200 
ttaccaaggt tacagatgac acactcaagc gtttcagtgt gcgatatctg cgacctgctc 1260 
gcagtcttgt cttcccttgg ttctcccctg ggggttcagg attacgaggc ctgaagctcc 1320 
tagaggctaa atgccagggg gatggagtga gctacgagga aaccactatt ccccgaccca 1380 
gcgcctacca caatctgttt ggattaccac tgattagtcg tcgagatgct gaggtggtac 1440 
tgacgagtcg tgagcttgac agcctggcct tgaaccagtc cacggggctg cctaccctta 1500 
ctctaccccg aggaacgacc tgcttacccc ctgccttact cccttacctg gaacagttcc 1560 
ggcggattgt attctggttg ggggatgacc ttcggtcctg ggaagccgcc aagttgtttg 1620 
cacgaaaact gaaccccaaa cgatgcttct tggtgcgacc aggagaccag caaccccgtc 1680 
ccctggaggc cctgaacgga ggcttcaatc tttctcgtat tcttcgtacc gccctgcctg 1740
cctggcacaa gtccatcgta tctttccggc agcttcggga ggaggtgcta ggagaactgt 1800 
cgaatgtgga gcaagcagct ggcctccgct ggagccgctt tccagacctc aatcgtatct 1860 
tgaagggaca tcgaaagggc gagctgacgg tcttcacagg gccaacaggc agtggaaaga 1920 
cgacattcat cagtgagtat gccctggatt tgtgttccca gggggtgaac acactgtggg 1980 
gtagcttcga gatcagcaat gtgagactag cccgggtcat gctgacacag tttgccgagg 2040 
ggcggctgga agatcaactg gacaaatatg atcactgggc tgaccgcttt gaggacctgc 2100 
ccctctattt catgactttc catggacagc aaagcatcag gactgtaata gatacaatgc 2160 
aacatgcagt ctacgtctat gacatttgtc atgtgatcat cgacaacctg cagttcatga 2220 
tgggtcacga gcagctgtcc acagacagga tcgcagctca agactacatc atcggggtct 2280 
ttcggaagtt tgcaacagac aataactgcc atgtgacact ggtcattcac ccccggaaag 2340 
aggatgatga caaggaactg cagacagcgt ccatttttgg ctcagccaaa gcaagccagg 2400 
aagcagacaa tgttctgatc ctgcaggaca ggaagctggt aaccgggcca gggaaacggt 2460 
atctgcaggt gtccaagaac cgctttgatg gagatgtagg tgtcttcccg cttgagttca 2520 
acaagaactc cctcaccttc tccattccac caaagaacaa ggcccggctc aagaagatca 2580 
aggatgacac tggaccagtg gccaaaaagc cctcttctgg caaaaagggg gctacgacac 2640 
agaactctga gatttgctca ggccaggccc ccactcccga ccagccagac acctccaagc 2700 
gttcaaagtg aaggccgtgc agagctggtc actgaaatga gcctgatagg ataggctgga 2760 
gcataaaact ctgcaagggc tcctctatcc tgtggtcctg agctgtgtgc ccttctcagt 2820 
ctgaggggcc taacctagag caggtttcca tagtgagaaa attcaatgta gcagactact 2880 
gagaaactac tgtgttgctc aggctttgtt tgaggtcctg tatatacagc actgaaaaga 2940 
gagataaagt ccctgcctgc atgcattctg gcggaagaga caagcaagca atgaacaaat 3000 
tagcagaaaa cctagtttta gtgaaaaatg ctgtaaagaa aatagaaatg cgatagagtg 3060 
ctggcaggct agtgtagata agtggtctga aaaggtgtct ctgagccgag ggcatgtgag 3120 
ctggggccta aacaactaga aggagagagc cacgtgaaca tccggcgaag gggacccagg 3180 
cagagagaaa aggaaatcca agccctgagg taggaatgag caggtcagat tcaaggcagt 3240 
gaggtcaggc cgcatgaacc tggaggggaa tcggggactt catgcgaaac tccagcctag 3300 
gctttcaaag tcaaagggtg atacagtggg taccaagctc ctctgctccc cactttgtag 3360 
agcctagcat gaggtggcat gtactagaat tggatcctag gtgcttagcc ctgcaatatc 3420 
agggcctcac tggtgggagc tgcctcgggc tgggttgctt ggtcatagag ccatagaagg 3480 
aagctgtcag cccggagtgc ctgccaccta gacactgatg ccattgtgtg ctgcctcaag 3540 
actgctggag tcaggacatt ttatagagcc ttttccagtt ttactaaaaa atttttccat 3600 
tgaaaaaaaa aaaaaaaaaa a 3621 

<210> 41 

<211> 1693 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 3121069CB1 



<400> 41 

catgaatgta ctgcaaggga acacatttgt gtcatgtgaa gagacatgac aaaaacagcc 60
ctccttaaat tatttgtggc aatagtgatc acattcattt taattttgcc ggaatatttc 120
aagacaccga aagaaagaac attggagcta tcatgtctgg aagtgtgttt gcaatctaat 180
tttacctatt cactctcctc cttaaatttt tcttttgtga cttttctgca accagtaagg 240
gaaactcaga ttatcatgag aatctttcta aatccctcca attttcgtaa cttcaccagg 300
acttgccaag acatcacagt tcttatcagg agaggatcaa tggaagtgaa agcaaatgat 360
tttcattcac cttgtcagca ctttaacttc agtgtagctc ctctggttga ccacttggag 420
gaatataaca ctacctgtca tctaaaaaac cacactggaa gatcaacaat catggaggat 480
gagccaagca aggagaaatc gataaactac acttgtagaa tcatggaata cccgaatgat 540
tgtatacaca tttctttgca cttggagatg gatataaaaa atatcacttg ttccatgaag 600
atcacttggt atattttagt tctattagtt tttatatttt tgatcatcct cactatccgc 660
aaaatacttg aaggccagag aagagtgcaa aagtggcaga gtcatagaga caaacctaca 720
tctgttctct taagaggaag tgattcggag aaactgagag cattgaatgt gcaggttctt 780
tcagagacca cgcagaggct gcctttggat caagtccagg aagtgcttcc cccaattcca 840
gaactataag ttacttccac agtgcatcag tgagatcaat atacacgaat atccccgggc 900
aagttggacc gagccctttg aagaatactc agaagtttat tttgtgaatg agtagactgg 960
aaaatgtttg tgtccagctg aggatgcaca gttggaaagc aggaggaatg ctgactggtt 1020
gatgaaaact agcttaagag cattcattcg ctccatgaga tcaagggaac aagagtgttt 1080
gcaagaagcc attatgagtc atggaaaaaa agatgatgaa acccatggaa acagcaagag 1140
aattcccact ctctctcttc ttaaaaaaaa tctatcatta tacagcacag agtggagcca 1200
agtttttaat tttgaggaac caaaaacagg atcaaatatg aaaacccttt cttttattgg 1260
gccacattgt agatgctgat ttgataattg tttcctatgc agatagatta tttttatttc 1320
acagattatt taaaagggaa gagggcctgg ttgtttattt atatgtttgt ttgcatttat 1380
gaatcttgct gccttttagc accaggatgt ttttaaaaaa attcaaagag gccaggcgca 1440
gtggctcatg cctgtaatcc cagcactttg ggattctgag gtgggaggat c^tgaggtca 1500
agggatcgag accatcctgg ccaacatggt gaaaccctgt ctctactaaa aacacaaaaa 1560
ttagctgggt gtggtggtgc gcgcctgtag tcccagctcc tcaggaggct gaggcaggag 1620
aatcacttga acctggcagg cagagtttgc agtgaaccaa gatcacgcca ctgcattaca 1680
gcctagacaa gca 1693

<210> 42

<211> 2289

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 3280626CB1

<400> 42

ggccgctgta acctcttcgg tccgcgacga tcctctagag cactgtgtgt ctccccggac 60
gcgagcccgc tcccctgagt aagagtcagc cagccgcgga tggggagcgt gagtggcgag 120
aatctgcaaa atggctgata atttggatga atttattgaa gagcaaaaag ccagattggc 180
cgaagacaaa gcagagttgg aaagtgatcc accttacatg gaaatgaagg gaaagttgtc 240
agcgaagctt tctgaaaaca gtaagatact gatctctatg gctaaggaaa acataccacc 300
aaatagtcaa cagaccaggg gttccttagg aattgattat ggattaagtt taccacttgg 360
agaagactat gaacggaaga aacataaatt aaaagaagaa ttgcggcaag attacagacg 420
ttatcttact caggaaaggt tgaaacttga acgtaacaaa gaatacaatc agtttctcag 480
gggtaaggaa gaatccagtg aaaagttcag gcaggtggaa aagagtactg agcccaagag 540
tcagagaaat aaaaaaccta ttggtcaagt taagcctgat ctaacttcac aaatacagac 600
atcttgtgaa aattcagagg gtcctagaaa agatgtctta actccttcag aggcatatga 660
agaacttctg aaccaaagac gactagagga ggacagatac cgacaactag atgatgaaat 720
cgaattaagg aatagaagaa ttattaaaaa agcaaatgaa gaagtgggca tttccaacct 780
aaaacatcaa aggtttgcaa gcaaggctgg cattccagat agaagatttc acagatttaa 840
tgaggatcgt gtttttgata gacggtatca tagaccagac caagatcctg aagtaagtga 900
agaaatggat gagaggttta gatatgaaag tgattttgat agaagacttt cgagagtgta 960
tacaaatgac aggatgcaca ggaacaaacg agggaatatg cctcctatgg aacatgatgg 1020
ggatgttata gaacagtcaa acataagaat ttcatctgct gaaaataaaa gtgctccaga 1080
caatgaaaca tccaaatctg ctaatcaaga tacctgtagt ccttttgcag ggatgctctt 1140
tggaggtgaa gatcgagaac ttattcagag aaggaaagag aaatacagac tagaactgtt 1200
ggaacaaatg gctgagcaac agaggaacaa gagacgagaa aaagatttag aactcagggt 1260
tgcagcgtct ggagcacaag accctgagaa atcgcctgat agactaaagc agtttagtgt 1320
ggcaccaaga cactttgaag agatgatacc acctgaaaga cccagaatag ctttccagac 1380
acctctccct cctttatctg ccccatctgt cccacccatc ccatcagttc atcctgttcc 1440
ttctcaaaat gaagatttgc gcagtggact cagcagcgcc cttggtgaaa tggtgtctcc 1500
caggattgca cctctgcctc cacctcccct actaccacct ttggctacta actatcgaac 1560
tccttatgat gatgcatact atttttatgg gtccaggaat actttcgatc ccagtcttgc 1620
ttattatggt tcaggaatga tgggcgtaca gcctgcagct tatgttagtg ctcctgtcac 1680
ccaccaacta gcacaacctg ttgtagtctc accctgtcac ccaggctgga gtacaatgtt 1740
gtgaacttgg ctcactgcaa cctccgcctc ctgagttcaa gtgattcttc tggctcagcc 1800
tcctgagaaa ctgggattaa aggcgtgcac cactatgccc ggctaagttt ttgtatcttt 1860
agtagagaca gggtttcacc atgttggcca ggctggtctc gaactcctga ccttgtgatc 1920
tgcctgcctc agcctcccaa agtcctggga ttacaggaat acggttggac agaatgaact 1980
gaagattaca agtgatcaag tgataaattc aggattgatt tttgaagata aaccgaaacc 2040
ttccaaacag tcacttcagt cttaccaaga ggctttgcag cagcagattc gggaaaaaaa 2100
aaaaaagagg ggggcggcga atattgagct cgtgaccgcg gaataaattc gggcgcgaac 2160
ctgcaggcga gagggaggga atctatatca agatatcgat accgggaccc gaaggggggc 2220
gcggacccaa ttcgctaagg gagcgataag gcggcaaggc gcggtaaacg cgggtggaac 2280
ctgcgtcac 2289

<210> 43

<211> 1304

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 484404CB1

<400> 43

ctcgcccacg cgtccggggc aggtaacagc tgcatcattg accgcacagc gccatctctc 60
cctgagaata aagccgatag ccaccctcct ccggctccga gcctgcttct gccacacctc 120
gctctcagtt ctctccacat ttccatagag accgtgtggt ttttgttcac ccgggccccc 180
catttatagg cataaaatcc actgtctgcc agcctccctt ccctcccacc tttttgtttt 240
acatttttta caccaatgta ccaaaaaggc ggacggctgc atttacgggg tctcccggag 300
ggccagagtc gtggcttaca gaagagacga aatgtggtct gagggacgat atgaatatga 360
aagaattccg agagaacgag cacctcctcg aagtcatccc agtgatgaat ctggttatag 420
atggacaaga gacgatcatt ctgcaagcag gcaacctgaa tacagggaca tgagagatgg 480
ctttagaaga aaaagtttct actcttccca ttatgcgaga gagcggtctc cttataaaag 540
ggacaatact tttttcagag aatcacctgt tggccgaaag gattctccac acagcagatc 600
tggttccagt gtcagtagca gaagctactc tccagaaagg agcaaatcat actctttcca 660
tcagtctcaa catagaaagt ccgtgcgtcc tggtgcctcc tacaaacggc agaatgaagg 720
aaatcctgaa agagataaag agaggcctgt ccagtctttg aaaacatcaa gagatacttc 780
accctcaagt ggttcagcag tttcttcatc aaaggtgtta gacaaaccca gtaggctaac 840
tgaaaaggaa cttgctgagg ctgcaagcaa gtgggctgct gaaaagctag agaaatcaga 900
tgaaagtaac ttgcctgaaa tttctgagta tgaggcggga tccacagcac cattgtttac 960
tgaccagcca gaggaacctg agtcaaacac aacacatggg atagaattat ttgaagatag 1020
tcagctaacc actcgctcta aagcaatagc atcaaaaacc aaagagattg aacaggttta 1080
ccgacaagac tgtgaaactt tcgggatggt ggtgaaaatg ctgattgaaa aagatccttc 1140
attagaaaag tctatacagt ttgcattgag gcagaattta catgaaatag gtgagcggtg 1200
tgttgaagaa ctcaagcatt tcattgcaga gtatgatact tccactcaag attttggaga 1260
gcctttttag atttttctgc tcaggctaaa aaaaaaaaaa aagg 1304

<210> 44

<211> 4850

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 2830063CB1

<400> 44

gtgcagcggc tgagatcacg tggtgcgccg ggaagccacc ccgcctctcc gaggcctccc 60 
tgccccgccc cgtcacgccc ctttcccggc gggacgcttt gagccgcccc gaactaagca 120
gggcggtcgg gggagtcata ctccatgggt ttatgtgata aatgatcttg gcatagaaga 180
gaatttaaag aatgatggct tcattccagc gctccaatag tcatgacaaa gtaaggagaa 240
tagttgcaga ggagggtcgt acagcaagaa acctaatagc ttggagtgtt ccactagaaa 300
gcaaagatga tgatggaaaa cctaaatgtc aaactggtgg aaaatctaag aggaccattc 360
aaggcactca taaaactact aaacagagta ctgcagtgga ctgtaaaata acatcgtcta 420
cgactggaga taaacacttt gataaaagtc ccactaaaac aaggcaccct cggaaaattg 480
atctaagagc tcgatactgg gcatttcttt ttgataatct tcgccgagca gtagatgaaa 540
tctatgtaac ttgcgaatca gatcagagtg tggtcgaatg taaggaggtg ctaatgatgc 600
tggataacta tgtaagagat ttcaaagcat tgattgactg gattcagctt caggaaaagc 660
tagagaagac agatgctcaa agcagaccaa catcattggc atgggaagta aagaagatgt 720
ctccgggacg ccatgtgatt ccaagtccat caacagatag aataaatgta acatcaaatg 780
ctcgacgaag cttaaatttt ggaggttcaa ctggcacagt gccagctcct cgtctggctc 840
ccacaggtgt cagttgggct gacaaggtaa aggctcatca tacaggctct actgcttctt 900
cagaaataac acccgcccag tcttgcccac caatgacagt gcagaaggcc tcacgcaaaa 960
atgaacggaa agatgctgaa ggatgggaga ccgttcagag aggaaggcct attcgttctc 1020
gatcaacagc agtgatgcca aaagtttcat tggcaacaga agccacaaga tcaaaggatg 1080
acagtgataa agaaaatgta tgtcttttac ctgatgaaag catacagaaa ggtcaatttg 1140
ttggagatgg aacttctaat actatagaat ctcatcccaa agactcatta cactcttgtg 1200
accatcctct tgccgaaaaa acccagttca cagtgagtac attggatgat gtgaagaatt 1260
ctggcagtat tcgagacaat tatgttcgaa cttctgaaat atctgctgtc cacattgata 1320
cagagtgtgt ttcagttatg ctgcaagctg gtacacctcc tttacaagta aatgaagaaa 1380
aatttccagc agagaaagca aggatagaaa atgaaatgga cccttcagat atttcaaatt 1440
ccatggcaga agtccttgct aaaaaagaag agctagcaga tcgtctagaa aaggccaatg 1500
aagaagccat tgctagtgct attgctgaag aagaacagtt aactagagaa attgaagctg 1560
aagaaaacaa tgatattaac attgaaactg acaacgacag tgatttttct gccagcatgg 1620
gcagtgggag tgtttctttc tgtggtatgt ccatggactg gaacgatgtc cttgcagatt 1680
atgaagctcg tgagtcttgg cgccaaaata catcctgggg ggacattgta gaagaagaac 1740
ctgctagacc tccagggcat ggaattcaca tgcatgaaaa actttcttca ccctctcgta 1800
aaagaacaat tgcagaatct aagaagaaac atgaagaaaa acaaatgaaa gcacagcagc 1860
taagggaaaa gttacgcgaa gagaaaacat tgaagcttca gaaattgtta gaaagggaga 1920
aggatgtccg gaagtggaag gaagaattgc tagatcaacg acgcaggatg atggaagaaa 1980
aattacttca tgctgagttt aagcgagaag tgcagttaca agcaattgtg aaaaaagcac 2040
aagaagaaga agctaaggta aatgaaattg cctttataaa tacccttgaa gcccagaata 2100
aacgtcatga tgttttatca aaattgaagg aatatgaaca gaggcttaat gagctacagg 2160
aagagcgtca gagaagacag gaagaaaagc aagcacgtga tgaagctgtg caggaacgca 2220
agagagctct agaggcagag cggcaggccc gtgtagaaga attgttaatg aagaggaaag 2280
aacaagaagc ccgaattgaa caacagaggc aagaaaagga aaaagcccgt gaggatgcag 2340
cccgggaaag agctagagac agggaagaac gattggcagc actcacagct gctcaacaag 2400
aagctatgga agagttacag aaaaaaattc agctcaagca tgatgaaagt attcgaaggc 2460
acatggaaca gattgaacaa agaaaagaaa aagctgctga gctaagcagt gggcgacatg 2520
caaatactga ttatgccccc aaactgaccc cttatgaaag aaagaagcag tgttctctct 2580
gcaatgtcct gatctcttca gaggtatatc tttttagcca tgttaaaggg agaaaacacc 2640
agcaagccgt gagagagaat accagcatcc aggggcgtga actgtcagat gaagaagtgg 2700
agcatctttc cttgaagaag tacattattg acattgtggt tgaaagtaca gctccagcag 2760
aagctttgaa agatggagaa gagcggcaaa aaaataaaaa aaaagccaaa aagataaaag 2820
cccggatgaa cttcagggct aaggaatatg agagtttaat ggaaaccaaa aattctggct 2880
ctgattcacc ttataaagca aagcttcagc gattagccaa agatcttcta aaacaagtac 2940
aagttcaaga cagtggctca tgggcaaaca ataaagtgtc tgctttggat cggaccctag 3000
gagagatcac tagaatactg gaaaaagaga atgtggcaga tcagattgca tttcaagctg 3060
ctggtggatt aacagccctt gaacacatcc ttcaagcagt agtcccagcc acaaatgtga 3120
acacagtttt aagaattcct cctaagtctc tctgcaatgc aatcaatgtt tacaacctca 3180
cctgcaataa ctgttcagaa aactgcagtg atgttctgtt tagtaacaag attaccttct 3240
taatggacct cctgatacac cagttgacgg tttatgttcc agatgaaaat aatactattt 3300
tggggagaaa tacaaataaa caagtttttg aaggcttgac aactggactt ctcaaagtca 3360
gtgctgtggt tttgggctgc ctgattgcca atcgaccaga tggaaactgc cagccagcta 3420
ccccaaaaat accaacacag gaaatgaaaa acaaaacctc acaaggtgat ccttttaaca 3480
atcgagttca ggaccttatc agctacgtgg tgaacatggg tctgattgac aaactgtgtg 3540
cctgcttcct ctcggtgcaa ggcccagtgg atgagaatcc caagatggcc atatttctgc 3600
agcatgccgc aggactctta catgcaatgt gtacactgtg ctttgctgtc actggaaggt 3660
catacagcat atttgacaat aatcgccagg atcccacagg gctgacagct gctcttcagg 3720
caaccgacct ggctggagtt cttcatatgc tctactgtgt cctcttccat ggcaccatct 3780
tggaccccag cactgccagt cccaaggaga attacactca aaataccatc caagtggcca 3840
ttcagagttt acgtttcttc aacagctttg cagctcttca tctgcctgct tttcagtcta 3900
ttgtaggggc agagggcttg tcccttgcat tccggcacat ggccagctcc ctgctgggcc 3960
actgcagcca agtctcctgt gaaagcctcc ttcatgaggt catcgtctgt gtgggctact 4020
tcactgtcaa ccacccagat aaccaggtga tcgtgcagtc cggccgccac cccacagtgc 4080
tgcagaagct ctgccagttg cccttccagt atttcagtga cccacggctg atcaaagtac 4140
tgttcccttc acttatcgct gcttgttaca acaaccatca gaacaagatc attctggagc 4200
aagagatgag ctgtgtttta ctggccactt tcattcagga tttggcacag actccaggtc 4260
aagcggaaaa ccagccttac caacccaaag ggaaatgcct tggttcccaa gactatcttg 4320
agctggctaa cagatttcct cagcaggcct gggaagaagc tcgacagttt ttcttgaaaa 4380
aagagaaaaa ataaatgttt tggttgattc tgtatttgag tacccttgtt aatattttaa 4440
attgtccaaa caaacattct aattgttcct taagaactca ttttcccatg tttatactct 4500
tcccacactg tagatatggc atgtacttta cactatttat aatgactgta gatacttgaa 4560
tgttctactt gctaattttg caagttgagt ttatttcatt tatgcagagt atcttggagt 4620
tggtaatttc catcttatga taatatatac tttgcatttg tgatatgggt gaaaggagac 4680
ataaaattag caagtctgtt tgttcttgta ataaagtaac ttattctgtt ttcattgttg 4740
acttttcatg ttaaggaaat acgaatctga aagaaaaatg ttaactccag ctcttgaagt 4800
atcttaaata aagacttaat taaagtttaa caaaaaaaaa aaaaaaaaaa 4850

<210> 45

<211> 4350

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 7506096CB1

<400> 45

atggaatcta gttcatcaga ctactataat aaagacaatg aagaggaaag tttgcttgca 60
aatgttgctt ccttaagaca tgaactgaag ataacagaat ggagtttgca gagtttaggg 120
gaagagttat ccagtgttag tccaagtgaa aattctgatt atgcccctaa tccttcaagg 180
tctgaaaagc taattttgga tgttcagcct agccaccctg gacttttgaa ttattcacct 240
tatgaaaacg tctgtaaaat atctggtagc agcactgatt ttcaaaaaaa gccaagagat 300
aagatgtttt catcttctgc ccctgtggat caggagatta aaagccttcg agagaaacta 360
aataaactta ggcaacagaa tgcttgtttg gtcacacaga atcattcctt aatgactaaa 420
tttgaatcta ttcactttga attaacacag tcaagagcaa aagtttctat gcttgagtct 480
gctcaacagc aggcagccag tgtcccaatc ttagaagaac agattataaa tttggaagca 540
gaggtttcag ctcaagataa agttttgaga gaggcagaaa ataagctgga acagagccag 600
aaaatggtaa ttgaaaagga acagagtttg caggagtcca aagaggaatg tataaaatta 660
aaggtggact tacttgaaca aaccaaacaa ggaaaaagag ctgaacgaca aaggaatgaa 720
gcactatata atgccgaaga gctgagtaaa gctttccaac aatataaaaa aaaagtggct 780
gaaaaactgg aaaaggttca agctgaagaa gaaatattag agagaaatct aactaactgt 840
gaaaaagaaa ataaaaggct acaagaaagg tgtggtctat ataaaagtga acttgaaatt 900
ctgaaagaga aattaaggca gttaaaagaa gaaaataaca acggaaaaga aaaattaagg 960
atcatggcag tgaaaaattc agaagtcatg gcacaactaa ctgaatctag acaaagtatt 1020
ttgaagctag agagtgagtt agagaacaaa gacgaaatac ttagagacaa attttcttta 1080
atgaatgaaa accgagaatt aaaggtccgt gttgcagcac agaatgagcg actagattta 1140
tgtcaacaag aaattgaaag ttcaagggta gaactaagaa gtttggaaaa gattatatcc 1200
cagttgccat taaaaagaga attatttggc tttaaatcat atctttctaa ataccagatg 1260
agtagcttct caaacaagga agaccgttgc attggctgct gtgaggcaaa taaattggtg 1320
atttcggaat tgagaattaa gcttgcaata aaagaggcag aaattcaaaa gcttcatgca 1380
aacctgactg caaatcagtt atctcagagt cttattactt gtaatgacag ccaagaaagt 1440
agcaaattaa gtagtttaga aacagaacct gtaaagctag gtggtcatca agtagcagaa 1500
agcgtaaaag atcaaaatca acatactatg aacaagcaat atgaaaaaga gaggcaaaga 1560
cttgttactg gaatagaaga actacgtact aagctgatac aaatagaagc tgaaaattct 1620
gatttgaagg ttaacatggc tcacagaact agtcagtttc agctgattca agaggagctg 1680
ctagagaaag cttcaaactc cagcaaactg gaaagtgaaa tgacaaagaa atgttctcaa 1740
cttttaactc ttgagaaaca gctggaagaa aagatagttg cttattcctc tattgctgca 1800
aaaaatgcag aactagaaca ggagcttatg gaaaagaatg aaaagataag gagtctagaa 1860
accaatatta atacagagca tgagaaaatt tgtttagcct ttgaaaaagc aaagaaaatt 1920
cacttggaac agcataaaga aatggaaaag cagattgaaa gagttaggca actagattca 1980
gcattggaaa tttgtaagga agaacttgtc ttgcatttga atcaattgga aggaaataag 2040
gaaaagtttg aaaaacagtt aaagaagaaa tctgaagagg tatattgttt acagaaagag 2100
ctaaagataa aaaatcacag tcttcaagag acttctgagc aaaacgttat tctacagcat 2160
actcttcagc aacagcagca aatgttacaa caagagacaa ttagaaatgg agagctagaa 2220
gatactcaaa ctaaacttga aaaacaggtg tcaaaactgg aacaagaact tcaaaaacaa 2280
agggaaagtt cagctgaaaa gttgagaaaa atggaggaga aatgtgaatc agctgcacat 2340
gaagcagatt tgaaaaggca aaaagtgatt gagcttactg gcactgccag gcaagtaaag 2400
attgagatgg atcagtacaa agaagagctg tctaaaatgg aaaaggaaat aatgcaccta 2460
aaacgagatg gagaaaataa agcaatgcac ctctctcaat tagatatgat cttagatcag 2520
acaaagacag agctagaaaa gaaaacaaat gctgtaaagg agttagaaaa gttacagcac 2580
agtactgaaa ctgaactaac agaagccttg caaaaacggg aagtacttga gactgaacta 2640
caaaatgctc atggagaatt aaaaagtact ttaagacaac tccaggaatt gagagatgta 2700
ctacagaagg ctcaattatc attagaggaa aaatacacta ctataaagga tctcacagct 2760
gaacttagag aatgcaagat ggagattgaa gacaaaaagc aggagctcct tgaaatggat 2820
caggcactta aagagagaaa ttgggaacta aagcaaagag cagctcaggt tacacatttg 2880
gatatgacta ttcgtgagca cagaggagaa atggaacaaa aaataattaa attagaaggt 2940
actctggaga aatcagaatt ggaacttaaa gaatgtaaca aacagataga aagtctgaat 3000
gacaaattac aaaatgctaa agaacagctt cgagaaaaag agtttataat gctacaaaat 3060
gaacaggaga taagtcaact gaaaaaagaa attgaaagaa cacaacaaag gatgaaagaa 3120
atggagagtg ttatgaaaga gcaagaacag tacattgcca ctcagtacaa ggaggccata 3180
gatttggggc aagaattgag gctgacccgg gagcaggtgc agaactctca tacagaattg 3240
gcagaggctc gtcatcagca agtccaagca cagagagaaa tagaaaggct ctctagtgaa 3300
ctggaggata tgaagcaact ctctaaagag aaagatgctc atggaaacca tttagctgaa 3360
gaactggggg cttctaaagt acgtgaagct catttagaag caagaatgca agcagaaatc 3420
aagaaattgt cagcagaagt agaatctctc aaagaagctt atcatatgga gatgatttca 3480
catcaagaga accatgcaaa gtggaagatt tctgctgact ctcaaaagtc ttctgttcag 3540
caactaaacg aacagttaga gaaggcaaaa ttggaattag aagaagctca ggatactgta 3600
agcaatttgc atcaacaagt ccaagatagg aatgaagtaa ttgaagctgc aaatgaagca 3660
ttacttacta aagaatcaga attaaccaga ttacaggcca aaatttctgg acatgaaaag 3720
gcagaagaca tcaagtttct gccagcccca tttacatctc caacagaaat tatgcctgat 3780
gttcaagatc caaaatttgc taaatgtttt cacacatctt tttccaagtg tacaaaatta 3840
cgtcgctcta ttagtgccag tgatcttact ttcaaaattc atggtgatga agatctttct 3900
gaagaattac tacaggactt aaagaaaatg caattagaac agccttcaac attagaagaa 3960
agccataaga atctgactta cacccagcca gactcattta aacctctcac atataaccta 4020
gaagctgata gttctgagaa taatgacttt aacacgctta gtgggatgct aagatacata 4080
aacaaagaag taagactatt aaaaaagtct tctatgcaaa caggtgctgg tttaaatcag 4140
ggagaaaatg tgtaattcaa agaagatact gatgtgttga aaaaatggaa tttttggtac 4200
tgtgctgttt acttattata tgtagctcat acttcataga agctgttatt ttgcttttga 4260
ataaatttta tatttcaata ttttaaaaga aagcccttct aaaacttaat tatattttta 4320
aagaaaattt aaaaaaaaaa aaaaaggggg 4350

<210> 46

<211> 2959

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 7505914CB1

<400> 46

ggggcgccat cttgtcttgt tcccgaagaa gtagaagcat cgaaagcgtt ggagaggtgt 60
taccggaacg gcggcgacaa gggtgttccc gaactagagt ggggcataca taatcttgct 120
gctatgcttc gaagctgtag tctgaatcaa cctaagtttt aaacagaagg tgaacctctg 180
agatagaaaa tcaagtatat tttaaaagaa gggatgtggg atcaaggagg acagccttgg 240
cagcagtggc ccttgaacca gcaacaatgg atgcagtcat tccagcacca acaggatcca 300
agccagattg attgggctgc attggcccaa gcttggattg cccaaagaga agcttcagga 360
cagcaaagca tggtagaaca accaccagga atgatgccaa atggacaaga tatgtctaca 420
atggaatctg gtccaaacaa tcatgggaat ttccaagggg attcaaactt caacagaatg 480
tggcaaccag aatggggaat gcatcagcaa cccccacacc cccctccaga tcagccatgg 540
atgccaccaa caccaggccc aatggacatt gttcctcctt ctgaagacag caacagtcag 600
gacagtgggg aatttgcccc tgacaacagg catatattta accagaacaa tcacaacttt 660
ggtggaccac ccgataattt tgcagtgggg ccagtgaacc agtttgacta tcagcatggg 720
gctgcttttg gtccaccgca aggtggattt catcctcctt attggcaacc aggacctcca 780
ggacctccag cacctcccca gaatcgaaga gaaaggccat catcattcag ggatcgtcag 840
cgttcaccta ttgcacttcc tgtgaagcag gagcctccac aaattgacgc agtaaaacgc 900
aggactcttc ccgcttggat tcgcgaaggt cttgaaaaaa tggaacgtga aaagcagaag 960
aaattggaga aagaaagaat ggaacaacaa cgttcacaat tgtccaaaaa agaaaaaaag 1020
gccacagaag atgctgaagg aggggatggc cctcgtttac ctcagagaag taaatttgat 1080
agtgatgagg aagaagaaga cactgaaaat gttgaggctg caagtagtgg gaaagtcacc 1140
agaagtccat ccccagttcc tcaagaagag cacagtgacc ctgagatgac tgaagaggag 1200
aaagagtatc aaatgatgtt gctgacaaaa atgcttctaa cagaaattct gctggatgtc 1260
acagatgaag aaatttatta cgtagccaaa gatgcacacc gcaaagcaac gaaaggtgga 1320
ctgggtggtt atggatcagg agacagtgaa gatgagagga gtgacagagg atctgagtca 1380
tctgacactg atgatgaaga attacggcat cgaatccggc aaaaacagga agctttttgg 1440
agaaaagaaa aagaacagca gctattacat gataaacaga tggaagaaga aaagcagcaa 1500
acagaaaggg ttacaaaaga gatgaatgaa tttatccata aagagcaaaa tagtttatca 1560
ctactagaag caagagaagc agacggtgat gtggttaatg aaaagaagag aactccaaat 1620
gaaaccacat cagttttaga accaaaaaaa gagcataaag aaaaagaaaa acaaggaagg 1680
agtaggtcgg gaagttctag tagtggtagt tccagtagca atagcagaac tagtagtact 1740
agtagtactg tctctagctc ttcatacagt tctagctcag gtagtagtcg tacttcttct 1800
cggtcttctt ctcctaaaag gaaaaagaga cacagtagga gtagatctcc aacaatcaaa 1860
gctagacgta gcaggagtag aagctattct cgcagaatta aaatagagag caatagggct 1920
agggtaaaga ttagagatag aaggagatct aatagaaata gcattgaaag agaaagacga 1980
cgaaatcgga gtccttcccg agagagacgt agaagtagaa gtcgctcaag ggatagacga 2040
accaatcgtg ccagtcgcag taggagtcga gataggcgta aaattgatga tcaacgtgga 2100
aatcttagtg ggaacagtca taagcataaa ggtgaggcta aagaacaaga gaggaaaaag 2160
gagaggagtc gaagtataga taaagatagg aaaaagaaag acaaagaaag ggaacgtgaa 2220
caggataaaa gaaaagagaa acaaaaaagg gaagaaaaag attttaagtt cagtagtcag 2280
gatgatagat taaaaaggaa acgagaaagt gaaagaacat tttctaggag tggttctata 2340
tctgttaaaa tcataagaca tgattctaga caggatagta agaaaagtac taccaaagat 2400
agtaaaaaac attcaggctc tgattctagt ggaaggagca gttctgagtc tccaggaagt 2460
agcaaagaaa agaaggctaa gaagcctaaa catagtcgat cgcgatccgt ggagaaatct 2520
caaaggtctg gtaagaaggc aagccgcaaa cacaagtcta agtcccgatc aaggtagtat 2580
actttttaaa gtattttgtc tgatttttaa aaaaaattga ctgaatttat tcaaagttga 2640
aagtgtcctt tctctctctc tttaataaac tcagtttggt acttgataaa taatcatagt 2700
cttaaatgtt agaaatccta tataatatta tttatttaaa attgcagatt tttaatttaa 2760
aatacatttt tatttttaaa ttttgtcttt tccctttttt tttcagatca acaacccctc 2820
cccgtcgtaa acgctgagga atgatgtggc aagaatgcca tgatgttctt taaaaaattc 2880
catgagtttt aagggcttgt ctcaggatag aggcacattg tggctgtgta ggtgaaacag 2940